Abstract — It is well known that for ergodic channel processes 
the Generalized Max-Weight Matching (GMWM) scheduling 
policy stabilizes the network for any supportable arrival rate 
vector within the network capacity region. This policy, however, 
often requires the solution of an NP-hard optimization problem. 
This has motivated many researchers to develop sub-optimal 
algorithms that approximate the GMWM policy in selecting 
schedule vectors. One implicit assumption commonly shared in 
this context is that during the algorithm runtime, the channel 
states remain effectively unchanged. This assumption may not 
hold as the time needed to select near-optimal schedule vectors 
usually increases quickly with the network size. In this paper, we 
incorporate channel variations and the time-efficiency of sub- 
optimal algorithms into the scheduler design, to dynamically 
tune the algorithm runtime considering the tradeoff between 
algorithm efficiency and its robustness to changing channel 
states. Specifically, we propose a Dynamic Control Policy (DCP) 
that operates on top of a given sub-optimal algorithm, and 
dynamically but in a large time-scale adjusts the time given to the 
algorithm according to queue backlog and channel correlations. 
This policy does not require knowledge of the structure of the 
given sub-optimal algorithm, and with low overhead can be 
implemented in a distributed manner. Using a novel Lyapunov 
analysis, we characterize the throughput stability region induced 
by DCP and show that our characterization can be tight. We 
also show that the throughput stability region of DCP is at least 
as large as that of any other static policy. Finally, we provide 
two case studies to gain further intuition into the performance 
of DCP. 

Index Terms — Throughput stability region, dynamic tuning, 
channel variation, approximate GMWM time-efficiency 

I. Introduction 

The problem of scheduling of wireless networks has been 
extensively investigated in the literature. A milestone in this 
context is the seminal work by Tassiulas and Ephremides [2],
where the authors characterized the network-layer capacity 
region of constrained queueing systems, including wireless 
networks, and designed a throughput-optimal scheduling pol- 
icy, commonly referred to as the GMWM scheduling. In this 
context, capacity region by definition is the largest region 
that can be stably supported using any policy, including those 
with the knowledge of future arrivals and channel states. A 
throughput-optimal policy is a policy that stabilizes the net-
work for any input rate that is within the capacity region and,
thus, has the largest stable throughput region. In general [3], [4],
the GMWM scheduling should maximize the sum of backlog- 
rate products at each timeslot given channel states, which 
can be considered as a GMWM problem. This problem has
been shown to be, in general, complex and NP-hard [5], [4], [6].
Even in those cases where the optimization problem can be 
solved polynomially, distributed implementation becomes a 
major obstacle. These issues, naturally, motivated researchers 
to study and develop suboptimal centralized or distributed 
algorithms that can stabilize a fraction of the network-layer 
capacity region Q g) (U . 

One implicit but major assumption in this context is that 
the time required to find an appropriate scheduling vector, 
search-time, is negligible compared to the length of a timeslot, 
or otherwise, during this search-time, channel states remain 
effectively unchanged. Since many algorithms take time
polynomial in the number of users to output a solution [5], [6], [9], we
see that this assumption may not hold in practice for networks
with a large number of users. In particular, it is possible that
once an optimal solution corresponding to a particular channel 
state is found, due to channel variations, it becomes outdated 
to the point of being intolerably far away from optimality. 

Intuitively, for many suboptimal algorithms, the solution 
found becomes a better and more efficient estimate of the 
optimal solution as the number of iterations increases or more 
time is given to the algorithm, e.g., see the PTAS in [6]. This
inspires us to consider this time-efficiency correspondence as 
a classifying tool for sub-optimal algorithms. As mentioned 
earlier, however, the solution found might become outdated 
due to channel variations. This poses the challenging problem
of how the search-time given to sub-optimal algorithms should
be adjusted to ensure efficient scheduling with a large stable
throughput region when channel states are time-varying.

Our work in this paper addresses the above challenge by 
joint consideration of channel correlation and time-efficiency 
of sub-optimal algorithms. In particular, we propose a dynamic 
control policy (DCP) that operates on top of a given sub- 
optimal algorithm A, where the algorithm is assumed to 
provide an approximate solution to the GMWM problem. Our 
proposed policy dynamically tunes the length of scheduling 
frames as the search-time given to the algorithm A so as to 
maximize the time average of backlog-rate product, improving 
the throughput stability region. This policy does not require the 
knowledge of input rates or the structure of the algorithm A, 
works with a general class of sub-optimal algorithms, and with 
low-overhead can be implemented in a distributed manner. We 
analyze the performance of DCP in terms of its associated 
throughput stability region, and prove that this policy enables 
the network to support all input rates that are within a $\theta$-scaled
version of the capacity region. The scaling factor $\theta$ is a
function of the interference model, algorithm A, and channel 
correlation, and we prove that in general this factor can be 
tight. We also show that the throughput stability region of 
DCP is at least as large as the one for any other static scheme 
that uses a fixed frame-length, or search-time, for scheduling. 

As far as we are aware, our study is the first that jointly 
incorporates the time-efficiency of sub-optimal algorithms and 
channel variations into the scheduler design and stability 
region analysis. One distinguishing feature of our work, apart 
from its practical implications, is the use of a Lyapunov
drift analysis that is based on a random number of steps. 
Therefore, to establish stability results, we use a method 
recently developed for Markov chains [10], and modify it such
that it is also applicable to our network model. 

The rest of this paper is organized as follows. We re-
view the related work in the next section. The network model,
including details of the arrival and channel processes, is presented
in Section III. The structures of the sub-optimal algorithms and
of DCP are discussed in Section IV. We then provide the
performance analysis and the related discussion in Section V,
followed by two case studies in Section VI. Finally, we
conclude the paper in Section VII.

II. Related Work 

Previous work on throughput-optimal scheduling includes 
the studies in [2], [11], [3], [12]. In particular, in [2], Tassiulas
and Ephremides characterized the throughput capacity region 
for multi-hop wireless networks, and developed the GMWM 
scheduling as a throughput-optimal scheduling policy. This 
result has been further extended to general network mod-
els with ergodic channel and arrival processes [3]. Due to
its applicability to general multi-hop networks, the GMWM 
scheduling has been employed, either directly or in a modified 
form, as a key component in different setups and many cross-
layer designs. Examples include control of cooperative relay
networks [12], rate control [13], energy efficiency [14], [15],
and congestion control [16], [17]. This scheduling policy has
also inspired pricing strategies maximizing social welfare [18],
and fair resource allocation [16].

Another example of throughput-optimal control is the
exponential rule proposed in [11]. In addition to the exponen-
tial rule scheduling, there are other approaches that use queue
backlog, either explicitly or implicitly, for scheduling [19],
[20], [21]. For instance, in [19], active queue management is
used that implements a CSMA protocol with backlog-dependent
transmission probabilities. It is shown that such an approach
can implement a distributed fair buffer. In another work [20],
an adaptive CSMA algorithm is proposed that iteratively adjusts
nodes' aggressiveness based on nodes' (simulated) queue
backlog.

The GMWM scheduling, despite its optimality, requires in every
timeslot the solution of the GMWM problem, which
can be, in general, NP-hard and non-approximable [6]. Thus,
many studies have focused on developing sub-optimal
constant-factor approximations to the GMWM scheduling. One interest-
ing study addressing the complexity issue is the work in [22],
where sub-optimal algorithms are modeled as randomized
algorithms, and it is shown that throughput-optimality can
be achieved with linear complexity. In a more recent work
[23], the authors propose distributed schemes to implement a
randomized policy similar to the one in [22] that can stabilize
the entire capacity region. These results, however, assume
non-time-varying channels. Other recent studies in [4], [24]
generalize the approach in [22] to time-varying networks, and
prove its throughput-optimality. This optimality, as expected,
comes at the price of requiring an excessively large amount of
other valuable resources in the network, which in this case
is memory storage. Specifically, the memory requirement in
[4], [24] increases exponentially with the number of users,
making the generalized approach hardly amenable to practical
implementation in large networks.

Another example of sub-optimal approximation is the work 
in O, where the authors assume that the controller can 
use only an imperfect scheduling component, and as an 
example they use maximal matching to design a distributed 
scheduling that is within a constant factor of optimality. This 
scheduling algorithm under the name of Maximal Matching 
(MM) scheduling and its variants have been widely studied 
in the literature Q J6] (25] (3 (26) (27] . In QO, it is shown 
that under simple interference models, MM scheduling can 
achieve a throughput (or stability region) that is at least 
half of the throughput achievable by a throughput-optimal 
algorithm (or the capacity region). Extended versions of these 
results for more general interference models are presented 
in [16], [9], where in [9] randomized distributed algorithms are
proposed for implementing MM scheduling, being a constant 
factor away from optimality. This result has been further
strengthened recently in [28], which states that the worst-case efficiency
ratio of Greedy Maximal Matching scheduling in geometric 
network graphs under the K-hop interference model is between 
1/6 and 1/3. All of the mentioned proposals so far either do 
not consider channel variations, or assume the search-time is 
relatively small compared to the length of a timeslot. 

The closest work to ours in this paper is [8], where, based on
the linear-complexity algorithm in [22], the impact of channel
memory on the stability region of a general class of sub- 
optimal algorithms is studied. Despite its consideration for 
channel variations, this work still does not model the search- 
time, and implicitly assumes it is negligible. 

In this paper, we consider the problem of scheduling from 
a new perspective. We assume a sub-optimal algorithm A is 
given that can approximate the solution of the GMWM prob- 
lem, and whose efficiency naturally improves as the search- 
time increases. We then devise a dynamic control policy which 
tunes the search-time, as the length of scheduling frames, 
according to queue backlog levels in the network, and also 
based on channel correlations. As far as we are aware, our 
study is the first that explicitly models the time-efficiency 
of sub-optimal approaches, and uses this concept along with 
channel correlation in the scheduler design. 

III. Network Model 

We consider a wireless network with N one-hop source-
destination pairs, where each pair represents a data flow.¹

1 Extension to multi-hop flows is possible using the methods in [2], [3].

Associated with each data flow, we consider a separate queue,
maintained at the source of the flow, that holds packets to 
be transmitted over a wireless link. Examples of this type of 
network include downlink or uplink of a cellular or a mesh 
network. 

A. Queueing 

We assume the system is time-slotted, and channels hold 
their state during a timeslot but may change from one timeslot 
to another. Let $\mathbf{s}(t)$ be the matrix of all channel states from
any given node $i$ to any other node $j$ in the network at time
$t$. For instance, when the network is the downlink or uplink
of a cellular network, $\mathbf{s}(t)$ reduces to the vector of user-
base-station channel states, i.e., $\mathbf{s}(t) = (s_1(t), \ldots, s_N(t))$,
where $s_i(t)$ is the state of the $i$th link (corresponding to the
$i$th data flow) at time $t$. Throughout the paper, we use bold
face to denote vectors or matrices. Let $\mathcal{S}$ represent the set of
all possible channel state matrices, with finite cardinality $|\mathcal{S}|$.
Let $D_i(t)$ denote the rate over the link corresponding to
the $i$th data flow at time $t$, and $\mathbf{D}(t)$ be the corresponding
vector of rates, i.e., $\mathbf{D}(t) = (D_1(t), \ldots, D_N(t))$. In addition,
let $I_i(t)$ represent the amount of resource used by the $i$th
link at time $t$, and $\mathbf{I}(t)$ be the corresponding vector, i.e.,
$\mathbf{I}(t) = (I_1(t), \ldots, I_N(t))$. The vector $\mathbf{I}(t)$ contains both
scheduling and resource usage information, and hereafter, we
refer to it simply as the schedule vector. Let $\mathcal{I}$ denote the set
containing all possible schedule vectors, with finite cardinality
$|\mathcal{I}|$.

Note that the exact specification of the scheduling vector 
I(t) is system dependent. For instance, in CDMA systems, 
it may represent the vector of power levels associated with 
wireless links; in OFDMA systems, it may represent the 
number of sub-channels allocated to each physical link; and 
when interference is modeled as the K-hop interference model 
[15], the vector can be a link activation vector representing a
sub-graph in the network. Assuming that transmission rates
are completely characterized given channel states, the schedule
vector, and the interference model, we have

$$\mathbf{D}(t) = \mathbf{D}(\mathbf{s}(t), \mathbf{I}(t)).$$

We assume that transmission rates are bounded, i.e., for all
$\mathbf{s} \in \mathcal{S}$ and $\mathbf{I} \in \mathcal{I}$,

$$D_i(\mathbf{s}, \mathbf{I}) \le D_{\max}, \quad 1 \le i \le N,$$

for some large $D_{\max} > 0$.

Let $A_i(t)$ be the number of packets arriving in timeslot
$t$ associated with the $i$th link (or data flow), and $\mathbf{A}(t)$ be
the vector of arrivals, i.e., $\mathbf{A}(t) = (A_1(t), \ldots, A_N(t))$. We
assume arrivals are i.i.d.² with mean vector

$$E[\mathbf{A}(t)] = \mathbf{a} = (a_1, \ldots, a_N),$$

and bounded above:

$$A_i(t) \le A_{\max}, \quad 1 \le i \le N,$$

for some large $A_{\max}$.

2 This assumption is made to simplify the analysis, and our results can be
extended to non-i.i.d. arrivals.



Finally, let $\mathbf{X}(t) = (X_1(t), \ldots, X_N(t))$ be the vector of
queue lengths, where $X_i(t)$ is the queue length associated with
the $i$th link (or data flow). Using the preceding definitions, we
see that $\mathbf{X}(t)$ evolves according to the following equation:

$$\mathbf{X}(t+1) = \mathbf{X}(t) + \mathbf{A}(t) - \mathbf{D}(t) + \mathbf{U}(t),$$

where U(t) represents the wasted service vector with non- 
negative elements; the service is wasted when in a queue the 
number of packets waiting for transmission is less than the 
number that can be transmitted, i.e., when Xi(t) < Di(t). 
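
The queue recursion above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `queue_step` is hypothetical, and the wasted service is assumed to be $U_i(t) = \max(D_i(t) - X_i(t), 0)$, which is one natural reading of the description of $\mathbf{U}(t)$.

```python
def queue_step(x, a, d):
    """One timeslot of X(t+1) = X(t) + A(t) - D(t) + U(t), applied per queue.

    x, a, d are lists holding the backlog, arrivals, and offered service
    of each link in the current timeslot.
    """
    nxt = []
    for xi, ai, di in zip(x, a, d):
        # Assumed form of the wasted service: the part of the offered
        # rate that finds no packet waiting in the queue.
        ui = max(di - xi, 0)
        nxt.append(xi + ai - di + ui)
    return nxt


# Three queues; the second and third are offered more service than they hold.
print(queue_step([5, 1, 0], [2, 0, 3], [3, 4, 1]))
```

With this choice of $U_i(t)$ the backlog never becomes negative, since the update is equivalent to $\max(X_i(t) - D_i(t), 0) + A_i(t)$.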

B. Channel State Process 

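We assume the channel state process is stationary and
ergodic. In particular, for all $\mathbf{s} \in \mathcal{S}$, as $k \to \infty$, we have

$$\frac{1}{k} \sum_{i=0}^{k-1} 1_{\mathbf{s}(i) = \mathbf{s}} \to \pi(\mathbf{s}) \quad \text{a.s.},$$

where $1_{(\cdot)}$ denotes the indicator function associated with a
given event, and $\pi(\mathbf{s})$ is the steady-state probability of state $\mathbf{s}$.
Let $\mathcal{V}_t$ represent the past history of the channel process and be
defined by $\mathcal{V}_t = \{\mathbf{s}(i); 0 \le i \le t\}$. The above almost sure
convergence implies that for any $\epsilon > 0$ and $\zeta > 0$, we can
find a sufficiently large $K_{\epsilon,\zeta,t} > 0$ such that [29]

$$P\left( \sup_{k \ge K_{\epsilon,\zeta,t}} \left| \frac{1}{k} \sum_{i=0}^{k-1} 1_{\mathbf{s}(t+i) = \mathbf{s}} - \pi(\mathbf{s}) \right| > \epsilon \;\middle|\; \mathcal{V}_t \right) < \zeta. \quad (1)$$

We assume that the almost sure convergence is uniform in
the past history and $t$, in the sense that regardless of $\mathcal{V}_t$ and $t$,
there exists a $K_{\epsilon,\zeta}$ such that (1) holds with $K_{\epsilon,\zeta,t} = K_{\epsilon,\zeta}$.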

C. Capacity Region 

In our context, the capacity region, denoted by $\Gamma$, is defined
as the closure of the set of all input rates that can be
stably supported by the network using any scheduling policy,
including those that use the knowledge of future arrivals and
channel states. In [2], [30], and recently under general conditions
in [3], it has been shown that the capacity region $\Gamma$ is given
by

$$\Gamma = \sum_{\mathbf{s} \in \mathcal{S}} \pi(\mathbf{s}) \, \text{Convex-Hull}\{\mathbf{D}(\mathbf{s}, \mathbf{I}) \mid \mathbf{I} \in \mathcal{I}\}.$$



IV. Dynamic Control Policy 

As mentioned in the introduction, DCP controls and tunes 
the search-time given to a sub-optimal algorithm to improve 
the stability region. The considered sub-optimal algorithms 
are assumed to provide a sub-optimal solution to the GMWM 
problem. In the following, we first elaborate on the structure 
of the sub-optimal algorithms, and then, describe the operation 
of DCP. 

3 Examples of this channel model include but are not limited to Markov 
chains. 
A. Sub-optimal Algorithms Approximating the GMWM Problem

It is well known that the GMWM scheduling is throughput-
optimal in that it stabilizes the network for all input rates
interior to the capacity region $\Gamma$. In each timeslot, this policy
uses the schedule vector $\mathbf{I}^*(t)$ that is the argmax of the following
GMWM problem:

$$\max \sum_{i=1}^{N} X_i(t) D_i(\mathbf{s}(t), \mathbf{I}), \quad \text{subject to } \mathbf{I} \in \mathcal{I}. \quad (2)$$

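However, as mentioned earlier, this optimization problem
can be in general NP-hard. We therefore assume that there
exists an algorithm A that can provide sub-optimal solutions
to the max-weight problem given in (2). To characterize the
structure of algorithm A, let $\mathbf{I}^*(\mathbf{X}, \mathbf{s})$ be the argmax of (2) obtained by
setting $\mathbf{X}(t) = \mathbf{X}$ and $\mathbf{s}(t) = \mathbf{s}$. Thus,

$$\mathbf{I}^*(\mathbf{X}, \mathbf{s}) = \arg\max_{\mathbf{I} \in \mathcal{I}} \mathbf{X}\mathbf{D}(\mathbf{s}, \mathbf{I}),$$

where $\mathbf{X}\mathbf{D}(\mathbf{s}, \mathbf{I})$ is the scalar product of the two vectors, and
for ease of notation, we have dropped the transpose symbol
required for $\mathbf{D}(\mathbf{s}, \mathbf{I})$. In the rest of this paper, we use the same
notation for scalar products. Associated with $\mathbf{I}^*(\mathbf{X}, \mathbf{s})$,
let $\mathbf{D}^*(\mathbf{X}, \mathbf{s})$ be defined as

$$\mathbf{D}^*(\mathbf{X}, \mathbf{s}) = \mathbf{D}(\mathbf{s}, \mathbf{I}^*(\mathbf{X}, \mathbf{s})). \quad (3)$$

Thus, $\mathbf{D}^*(\mathbf{X}, \mathbf{s})$ is the optimal rate, in the sense of (2), when
the backlog vector is $\mathbf{X}$ and the channel state is $\mathbf{s}$.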
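
For a small, explicitly enumerated schedule set, the max-weight problem (2) can be solved by exhaustive search. The sketch below is illustrative only: the function names are hypothetical, and the toy rate model $D_i(\mathbf{s}, \mathbf{I}) = s_i I_i$ with one active link per schedule is an assumption made for the example, not the paper's interference model. Exhaustive search scales with the size of the schedule set, which is exactly why it is impractical in general.

```python
def toy_rate(s, sched):
    """Hypothetical rate model: an active link transmits at its channel rate."""
    return [si * ii for si, ii in zip(s, sched)]


def max_weight_schedule(x, s, schedules, rate=toy_rate):
    """Exhaustively pick the schedule maximizing sum_i x[i] * D_i(s, I)."""
    best, best_weight = None, float("-inf")
    for sched in schedules:
        weight = sum(xi * di for xi, di in zip(x, rate(s, sched)))
        if weight > best_weight:
            best, best_weight = sched, weight
    return best, best_weight


# Three links, one active at a time; link 2 has the largest backlog
# but its channel is currently in a zero-rate state.
schedules = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(max_weight_schedule([3, 10, 4], [2, 0, 1], schedules))
```

The example shows why backlog alone does not decide the schedule: the backlog-rate product, not the backlog, is maximized.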

Let $\mathbf{I}^{(n)}$ be the output schedule vector of algorithm A when
it is given an amount of time equal to $n$ timeslots, $\mathbf{X}(t) = \mathbf{X}$,
and $\mathbf{s}(t) = \mathbf{s}$. We therefore assume that the time given to
algorithm A can be programmed or tuned as desired, or simply that
the algorithm can continue or iterate towards finding better
solutions over time. We assume that $\mathbf{I}^{(n)}$ is in general a random
vector with distribution $\mu^{(n)}_{\mathbf{X},\mathbf{s}}$. Since the objective function in
(2) is a continuous function of $\mathbf{X}(t)$, we naturally assume that
algorithm A, characterized by the distribution of $\mathbf{I}^{(n)}$ for all
$n \ge 1$ and all values of $\mathbf{X}$ and $\mathbf{s}$, has the following property:

Assumption 1: For all $\mathbf{I} \in \mathcal{I}$, $\mathbf{s} \in \mathcal{S}$, and $n$, we have that

$$\left| \mu^{(n)}_{\mathbf{X}_1,\mathbf{s}}(\mathbf{I}^{(n)} = \mathbf{I}) - \mu^{(n)}_{\mathbf{X}_2,\mathbf{s}}(\mathbf{I}^{(n)} = \mathbf{I}) \right| \to 0$$

as $\mathbf{X}_1 \to \mathbf{X}_2$. In addition, assuming and keeping $\|\mathbf{X}_1 - \mathbf{X}_2\| <
C$ for a given $C > 0$, the above convergence also holds when
$\|\mathbf{X}_1\| \to \infty$. Moreover, the convergence becomes equality if
$\mathbf{X}_1 = \beta \mathbf{X}_2$, for some $\beta > 0$.

In the following, we discuss concrete models that provide 
further details on the structure of algorithm A. Note that these 
models serve only as examples, and our results do not depend 
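on any of these models; all that is required is Assumption 1.

The first model arises from the intuition that the distribution
$\mu^{(n)}_{\mathbf{X},\mathbf{s}}$ should improve as $n$ increases. More precisely, we
can define the sequence $\{\mu^{(n)}_{\mathbf{X},\mathbf{s}}, n = 1, 2, 3, \ldots\}$ to be an
improving sequence if for all $n > 1$,

$$E[\mathbf{X}\mathbf{D}(\mathbf{s}, \mathbf{I}^{(n)})] \ge E[\mathbf{X}\mathbf{D}(\mathbf{s}, \mathbf{I}^{(n-1)})] \ge \cdots \ge E[\mathbf{X}\mathbf{D}(\mathbf{s}, \mathbf{I}^{(1)})].$$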

The first model uses the above and defines a natural algorithm 
to be the one for which the above inequalities hold for all 
values of X and s. 
Fig. 1. Illustration of scheduling rounds, test intervals, update intervals, and
frames. 

As for the second model, we may have that $\mathbf{I}^{(n)}$ is such
that

$$\mathbf{X}\mathbf{D}(\mathbf{s}, \mathbf{I}^{(n)}) \ge g(n)\, \mathbf{X}\mathbf{D}(\mathbf{s}, \mathbf{I}^*(\mathbf{X}, \mathbf{s})), \quad (4)$$

where the function $g(n)$ is a non-decreasing function of $n$,
and less than or equal to one. For instance, if the optimization
problem can be approximated by a convex problem [31], then
$g(n) = \xi(1 - \zeta^n)$, where $0 < \xi \le 1$ and $0 < \zeta < 1$. Another
possible form for g(n) is 

( ^ Inn ) ' 

where $\beta$ is a positive constant. This form of $g(n)$ may stem
from cases where the optimization problem associated with
(2) admits a Polynomial-Time Approximation Scheme (PTAS)
[6].

The last model that we consider is a generalization of the
previous model, where we assume that (4) holds with probabil-
ity $h(n)$, a non-decreasing function of $n$. This specification
can model algorithms that use randomized methods to solve
(2), and, apart from its consideration of the improvement over $n$,
is similar to the ones developed in [22], [8].
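
As a toy instance of the improving-sequence and efficiency-factor models, consider an "anytime" search that inspects one more candidate schedule per unit of time and always keeps the best one seen so far. This is only a sketch under stated assumptions: the function names, the fixed candidate ordering, and the rate model $D_i(\mathbf{s}, \mathbf{I}) = s_i I_i$ are all hypothetical, and the algorithm is not one analyzed in the paper. The achieved weight is non-decreasing in $n$, and the ratio to the optimum plays the role of an empirical $g(n)$.

```python
def weight(x, s, sched):
    """Backlog-rate product under the assumed rate model D_i = s_i * I_i."""
    return sum(xi * si * ii for xi, si, ii in zip(x, s, sched))


def anytime_search(x, s, schedules, n):
    """Best schedule among the first n candidates (n >= 1)."""
    return max(schedules[:n], key=lambda sched: weight(x, s, sched))


x, s = [3, 10, 4], [1, 1, 2]
schedules = [(1, 0, 0), (0, 0, 1), (0, 1, 0)]  # fixed inspection order

# Weight achieved after n = 1, 2, 3 units of search time.
weights = [weight(x, s, anytime_search(x, s, schedules, n)) for n in (1, 2, 3)]
# Empirical efficiency factor: achieved weight relative to the optimum.
ratios = [w / max(weights) for w in weights]
print(weights, ratios)
```

Because the best-so-far candidate is never discarded, giving the search more time can only help for a frozen backlog and channel state; the cost of that extra time under changing channels is what DCP is designed to balance.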

B. Dynamic Control Policy and Scheduling 

The dynamic control policy in this paper interacts with the
scheduling component and, through measures that
will be defined later, dynamically tunes the time spent by the
scheduler, or more precisely algorithm A, to find a schedule
vector. In what follows, we describe the joint operation of
DCP and the scheduler.

As DCP operates, the time axis becomes partitioned into
a sequence of scheduling rounds, where each round might
consist of a different number of timeslots. An illustrative
example is provided in Fig. 1. Let $t_k$ denote the start time of
the $k$th round. Each round begins with a test interval followed
by an update interval. At the beginning of the test interval
of each round, a candidate value for the number of timeslots
given to algorithm A to solve (2) is selected by DCP.
Let $N_1^r(t_k)$ denote this candidate value for the $k$th round, and
assume $N_1^r(t_k) \in \mathcal{N}_1$, where $\mathcal{N}_1$ has a finite cardinality. In
the rest, we use $N_1^r$ instead of $N_1^r(t_k)$ where appropriate.
The algorithm that chooses the candidate value might be in
general a randomized algorithm. Thus, we use the superscript
$r$ to make this point clear. We assume $N_1^r$ takes an optimal
value with probability $\delta > 0$, where optimality will be defined
later by (7) and its following discussion.
We set the length of the test interval to be

$$N_1^r N_2^r = N_c = \text{const.},$$

a multiple of $N_1^r$, where $N_2^r$ is adjusted accordingly so that the
test interval has a fixed length $N_c$. Therefore, given $N_1^r$, the
test interval becomes partitioned into $N_2^r$ consecutive frames
of $N_1^r$ timeslots. At the beginning of each frame, e.g., at time
$t$, the current backlog vector $\mathbf{X}(t)$ and channel state $\mathbf{s}(t)$
are provided to algorithm A. The algorithm then spends
$N_1^r$ timeslots to find a schedule vector. Depending on the
properties of a particular instance of algorithm A, this vector
is used by the scheduler to update scheduling decisions in the
next frame in a variety of methods.

In the first method, the schedule vector found after $N_1^r$
timeslots in the frame starting at time $t$ is used throughout
the next frame of $N_1^r$ timeslots starting at time $t + N_1^r$. Thus,
the schedule vector used in any frame is obtained by using 
backlog and channel state information at the beginning of its 
previous frame. This method is general and can be applied to 
all types of algorithm A. 
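
The pipelining in the first method can be made concrete with a small helper that reports which timeslot's information the schedule in use at time $t$ was computed from. This is a minimal sketch, assuming frames of a fixed length `n1` that start at timeslot 0; the function names `info_time` and `staleness` are hypothetical.

```python
def info_time(t, n1):
    """Timeslot whose backlog and channel state produced the schedule used at t.

    Frames are [0, n1), [n1, 2*n1), ...; the schedule used in a frame was
    computed during the previous frame from the information available at
    that previous frame's start. Returns None in the first frame, where no
    computed schedule exists yet.
    """
    frame = t // n1
    if frame == 0:
        return None
    return (frame - 1) * n1


def staleness(t, n1):
    """Age, in timeslots, of the information behind the schedule used at t."""
    src = info_time(t, n1)
    return None if src is None else t - src


# With frames of 3 timeslots, the schedule used at t = 5 is based on t = 0.
print(info_time(5, 3), staleness(5, 3))
```

Under these assumptions the information age always lies between `n1` and `2 * n1 - 1` timeslots, which illustrates the tradeoff: a longer frame gives the algorithm more time but makes the schedule rely on older channel states.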

We can apply a second method where algorithm A is 
capable of outputting schedule vectors in intermediate steps, 
and not only after the planned $N_1^r$ timeslots. Consider the $i$th
timeslot of a given frame of $N_1^r$ timeslots started at time $t$,
where $i < N_1^r$. Suppose $\mathbf{I}_{t+i}$ is the intermediate solution found
by algorithm A after $i$ timeslots in the considered frame,
and $\mathbf{I}_p$ is the vector found at the end of its previous frame.
Then, we may assume that with some probability, $\mathbf{I}_{t+i}$ is used
if

$$\mathbf{X}(t+i)\mathbf{D}(\mathbf{s}(t+i), \mathbf{I}_{t+i}) > \mathbf{X}(t+i)\mathbf{D}(\mathbf{s}(t+i), \mathbf{I}_p);$$

otherwise $\mathbf{I}_p$ is used in the timeslot following the $i$th timeslot.
The update rule in (8) provides an example where two schedule 
vectors are compared, and the best is selected with a well- 
defined probability. 

As for the third method, we may assume algorithm A 
can accept an initial schedule vector. In this case, we can 
assume that the algorithm A at a given frame accepts the 
schedule vector found in the previous frame as the initial point 
to the optimization problem of (2). Note that many graph-
inspired algorithms do not start from a given initial vector
(as a sub-graph), but instead, gradually progress towards a
particular solution. These algorithms,⁴ therefore, do not belong
to the class of algorithms considered for this method. A fourth
method can also be considered by mixing the second and the
third methods if algorithm A has the corresponding required
properties. Our results in this paper extend to these methods
as long as Property 1 and Property 2 in Section V-B hold.

4 Adaptation of these algorithms to time-varying networks is an interesting
problem, and is left for future research.



Given $N_1^r$ and a method to use the output of algorithm
A, DCP evaluates the scheduling performance resulting from the
chosen value of $N_1^r$. The performance criterion is the normalized
time-average of the backlog-rate (scalar) product. To define
the criterion precisely, let $\varphi(\cdot,\cdot,\cdot)$ be defined as

$$\varphi(t, n_1, n_2) = \frac{\sum_{j=0}^{n_2-1} \sum_{i=0}^{n_1-1} \mathbf{X}_{t+jn_1+i}\, \mathbf{D}_{t+jn_1+i}}{n_1 n_2 \|\mathbf{X}_t\|}.$$

If $\|\mathbf{X}_t\| = 0$, we set $\varphi(t, n_1, n_2) = 0$. Based on the above
definition, the criterion associated with the test interval of the
$k$th scheduling round, which is computed by DCP, is denoted
by $\varphi^r(t_k)$, where

$$\varphi^r(t_k) = \varphi(t_k, N_1^r(t_k), N_2^r(t_k)).$$

This quantity is then used to determine the length of frames
in the update interval of the $k$th round.
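
The criterion is straightforward to compute from recorded backlog and rate vectors. The following sketch assumes the Euclidean norm for $\|\mathbf{X}_t\|$, which the text does not specify, and takes per-timeslot lists `backlogs` and `rates` as inputs; these names and the function name `phi` are hypothetical.

```python
import math


def phi(backlogs, rates, t, n1, n2):
    """Normalized time-average of the backlog-rate product over n2 frames
    of n1 timeslots each, starting at timeslot t.

    backlogs[tau] and rates[tau] are the backlog and rate vectors at tau.
    The Euclidean norm is an assumption made for this illustration.
    """
    norm = math.sqrt(sum(v * v for v in backlogs[t]))
    if norm == 0:
        return 0.0  # convention used when the backlog at time t is zero
    total = 0.0
    for j in range(n2):
        for i in range(n1):
            tau = t + j * n1 + i
            total += sum(xv * dv for xv, dv in zip(backlogs[tau], rates[tau]))
    return total / (n1 * n2 * norm)


# Two links observed over four timeslots with a frozen backlog of (3, 4).
backlogs = [[3, 4]] * 4
rates = [[1, 0], [0, 1], [1, 1], [0, 0]]
print(phi(backlogs, rates, t=0, n1=2, n2=2))
```

Dividing by the backlog norm makes values measured at different backlog levels comparable, which is what allows DCP to compare a test interval against an earlier update interval.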

Update intervals are similar to the test intervals in that they
consist of multiple fixed-length frames.
More precisely, we assume that the update interval in the
$k$th round becomes partitioned into $N_2(t_k) N_3(t_k)$ consecutive
frames of $N_1(t_k)$ timeslots. Integers $N_1(t_k)$ and $N_2(t_k)$ are
such that

$$N_1(t_k) N_2(t_k) = N_c. \quad (5)$$

Therefore, the length of the $k$th update interval is $N_3(t_k)$ times
the length of a test interval. Moreover, we see that $N_1(t_k)$ in
the $k$th update interval takes the role of $N_1^r(t_k)$ in the $k$th
test interval. Assuming the same method is applied to all test
and update intervals to use the output of algorithm A, we can
properly define $\varphi(t_k)$ as

$$\varphi(t_k) = \varphi(t_k + N_c, N_1(t_k), N_2(t_k) N_3(t_k)).$$

The quantity $\varphi(t_k)$ is similar to $\varphi^r(t_k)$, and measures the
normalized time-average of the backlog-rate product in the $k$th
update interval.

DCP, on top of algorithm A, uses $\varphi(t_{k-1})$ and $\varphi^r(t_k)$ to
dynamically control the values of $N_1(t_k)$ and $N_3(t_k)$ over time.
Specifically, in the $k$th round, at the end of the test interval,
the policy chooses either the $N_1$ used in the previous update
interval, $N_1(t_{k-1})$, or the newly chosen value of $N_1$ in the
current test interval, $N_1^r(t_k)$, according to the following update
rule:

$$N_1(t_k) = \begin{cases} N_1^r(t_k) & \text{if } \varphi^r(t_k) > \varphi(t_{k-1}) + \alpha, \\ N_1(t_{k-1}) & \text{otherwise,} \end{cases}$$

where $\alpha$ is a suitably small but otherwise arbitrary positive
constant. At the same time, the value of $N_3(t_k)$ is updated
according to the following:

$$N_3(t_k) = \begin{cases} \max\left(1, \frac{N_3(t_{k-1})}{2}\right) & \text{if } \varphi^r(t_k) > \varphi(t_{k-1}) + \alpha, \\ \min(L_1, 2N_3(t_{k-1})) & \text{otherwise,} \end{cases}$$

where $L_1$ is a suitably large but otherwise arbitrary positive
constant. Note that $N_2(t_k)$ becomes updated such that (5)
holds. Once the values of $N_1$, $N_2$, and $N_3$ are updated, in
the rest of the scheduling round, which by definition is the
update interval, the policy proceeds with computing the time
average $\varphi(t_k)$. When the $k$th round finishes, the $(k+1)$th round
starts with a test interval, and DCP proceeds with selecting
$N_1^r(t_{k+1})$ and applying the update rule at the end of the $(k+1)$th
test interval. This completes the description of the joint operation
of DCP and the scheduling component.
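
The per-round decision can be summarized in a short function. This is a sketch of the update rule as described above, not the authors' code: the function name and argument names are hypothetical, the halving of $N_3$ is implemented with integer (floor) division as an assumption, and $N_1$ is assumed to divide $N_c$ so that (5) can hold with an integer $N_2$.

```python
def dcp_update(phi_test, phi_prev, n1_test, n1_prev, n3_prev, alpha, l1, n_c):
    """Return (N1, N2, N3) for the update interval of the current round.

    phi_test: criterion measured in the current test interval.
    phi_prev: criterion measured in the previous update interval.
    n1_test:  candidate frame length tried in the test interval.
    n1_prev, n3_prev: values used in the previous update interval.
    alpha: small positive margin; l1: upper limit on N3; n_c: test length.
    """
    if phi_test > phi_prev + alpha:
        # The candidate clearly outperforms the incumbent: adopt it and
        # shorten the update interval (floor division is an assumption).
        n1 = n1_test
        n3 = max(1, n3_prev // 2)
    else:
        # Keep the incumbent and commit to it for a longer update interval.
        n1 = n1_prev
        n3 = min(l1, 2 * n3_prev)
    n2 = n_c // n1  # chosen so that N1 * N2 = N_c, assuming n1 divides n_c
    return n1, n2, n3


print(dcp_update(0.9, 0.7, n1_test=2, n1_prev=4, n3_prev=8,
                 alpha=0.05, l1=16, n_c=8))
```

The margin `alpha` means a candidate that is only marginally better than the incumbent is not adopted, which matches the stated purpose of avoiding oscillation between closely performing frame lengths.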

Considering the above description, we see that DCP keeps
trying new values for $N_1$. Once a good candidate is found
for $N_1$, the update rule with high probability uses this value
for longer periods of time by doubling the length of update
intervals. In case the performance in terms of the backlog-rate
product degrades, the length of update intervals is halved
to expedite trying new values for $N_1$. Note that $\alpha$ can be
arbitrarily small, but should be a positive number. This avoids
fluctuations between different values of $N_1$ performing closely,
thus preventing short update intervals. In addition, it limits
incorrect favoring of new values of $N_1$ in the test
intervals, where due to atypical channel conditions, the nor-
malized backlog-rate product deviates from and goes beyond
its expected value. Finally, note that $L_1$ can be arbitrarily
large, but should be a finite integer. This assumption is mainly
analysis-inspired but is also motivated by the fact that a larger
$L_1$ can lead to a larger delay.

V. Performance Analysis 

In this section, we evaluate the performance of DCP in terms 
of its associated stability region. We first introduce several key 
definitions and functions, and then state the main theorem of 
the paper. 

A. Definitions 

Since the backlog vector is non-Markovian, we consider the 
following definition for the stability of a process. 

1) Stability: Suppose there are a bounded closed region $\mathcal{C}$
around the origin and a real-valued function $F(\cdot) \ge 0$ such that
the following holds: For any $t$, and $\sigma_{\mathcal{C}}$ defined by

$$\sigma_{\mathcal{C}} = \inf\{i \ge 0 : \mathbf{X}_{t+i} \in \mathcal{C}\},$$

we have

$$E[\sigma_{\mathcal{C}}] \le F(\mathbf{X}(t))\, 1_{\mathbf{X}(t) \notin \mathcal{C}}.$$

Then, the system is said to be stable.

This definition implies that when $\mathbf{X}(t) \notin \mathcal{C}$, e.g., when
$\|\mathbf{X}(t)\|$ is larger than a threshold, the conditional expectation
of the time required to return to $\mathcal{C}$, e.g., so that $\|\mathbf{X}(t)\|$
becomes less than or equal to the threshold, is bounded by
a function of only $\mathbf{X}(t)$, uniformly in the past history and
$t$. This definition further implies that if the sequence $\mathbf{X}(t)$ is
stable, then [32]

$$\lim_{k \to \infty} \limsup_{t \to \infty} P(\|\mathbf{X}(t)\| > k) = 0.$$



2) $\theta$-scaled Region and Maximal Stability: Suppose $0 \le
\theta \le 1$. A region is called $\theta$-scaled of the region $\Gamma$, and denoted
by $\theta\Gamma$, if it contains all rates that are $\theta$-scaled of the rates in
$\Gamma$, i.e.,

$$\theta\Gamma = \{\mathbf{a}_1 : \mathbf{a}_1 = \theta \mathbf{a}_2, \text{ for some } \mathbf{a}_2 \in \Gamma\}.$$

Further, the $\theta$-scaled region is called maximally stable if for all
arrival rate vectors interior to $\theta\Gamma$, the system can be stabilized,
and for all $\epsilon > 0$ there exists at least one rate vector interior to
$(\theta + \epsilon)\Gamma$ that makes the system unstable, both under the same
given policy. Thus, maximal stability determines the largest
scaled version of $\Gamma$ that can be stably supported under a given
policy.

B. Auxiliary Functions and Their Properties 

To define the first function, hypothetically suppose for all $t$,
$\mathbf{X}(t) = \mathbf{X}$ for a given $\mathbf{X}$, $\mathbf{X} \ne \mathbf{0}$, and thus, $\mathbf{X}(t)$ does not get
updated. In addition, assume that $N_1$ has a fixed value over
time. Considering these assumptions and an update interval of
an infinite number of frames,⁵ each consisting of $N_1$ timeslots,
we can see that in the steady state, the expected normalized
backlog-rate product, averaged over one frame, is equal to

$$\phi(\mathbf{X}, N_1) = E\left[ \frac{\sum_{i=1}^{N_1} \mathbf{X}\mathbf{D}_i}{N_1 \|\mathbf{X}\|} \right], \quad (6)$$

where $\mathbf{D}_i$ is the rate vector in the $i$th timeslot of a given
frame in the steady state. This expectation is over the steady- 
state distribution of channel process, and possibly over the 
randomness introduced by the algorithm A. 

Intuitively, $\phi(X, N_1)$ states how well a particular choice for $N_1$ performs, in terms of backlog-rate product, when queue-length changes are ignored. This is exactly what we need to study, since the stability region often depends on the behavior of scheduling at large queue-lengths, where in a finite window of time the queue-lengths do not change significantly.

To simplify notation, where appropriate, we use $t$ as the first argument of $\phi(\cdot, \cdot)$; by that we mean$^6$

$$\phi(t, N_1) = \phi(X(t), N_1).$$

Having defined $\phi(X, N_1)$, we define $\hat{N}_1(X)$ and $\hat{\phi}(t)$ by

$$\hat{N}_1(X) = \arg\max_{N_1 \in \mathcal{N}_1} \phi(X, N_1) \quad (7)$$

and

$$\hat{\phi}(t) = \hat{\phi}(X(t)) = \phi(X(t), \hat{N}_1(X(t))).$$

Finally, for a given $X$ with $\|X\| \neq 0$, we define

$$\chi(X) = E_s\left[\frac{X D^*(X, s)}{\|X\|}\right],$$

where $D^*(X, s)$ is as defined earlier, and the expectation is over the steady-state distribution of the channel process.

According to the above definitions, when variations in the backlog vector are ignored after time $t$ and $N_1$ is confined to a fixed value, $\hat{N}_1(X(t))$ becomes the optimal value for $N_1$ in terms of the normalized backlog-rate product, and $\hat{\phi}(t)$ represents the corresponding expected value. In particular, note that $\hat{N}_1(X)$ is a function of $X$ and may take different values for different $X$'s. The quantity $\chi(X)$, on the other hand, is the expected normalized backlog-rate product if for all states we could find the optimal schedule vector. This quantity, therefore, can serve as a benchmark to measure the performance of sub-optimal approaches.

5 Here, we assume the channel evolves, and that the algorithm $A$ is used in the same manner as it is used in an ordinary update interval with a finite $N_c$, as discussed in Section IV-B.

6 By definition of $\phi(\cdot, \cdot)$, here we hypothetically assume the backlog vector $X(t_1)$ for all times $t_1$ is equal to $X(t)$.
Note that $\chi(X)$ is a continuous function of $X$ and does not depend on $\|X\|$. Similarly, by Assumption 1, $\phi(X, N_1)$ does not depend on $\|X\|$, and is expected to have the following property.

Property 1: Suppose $\|X_1 - X_2\| < C$ for a given $C > 0$. For any given $\epsilon > 0$, there exists a sufficiently large $M > 0$ such that if $\|X_1\| > M$, then for all $N_1 \in \mathcal{N}_1$

$$|\phi(X_1, N_1) - \phi(X_2, N_1)| < \epsilon.$$

If the first or the second method in Section IV-B is used, this property holds since, by Assumption 1, algorithm $A$ statistically finds similar schedule vectors when two backlog vectors are close and large. In case the third or the fourth method is used, it is possible to consider explicit restrictions for algorithm $A$ such that $\phi(X, N_1)$ is well-defined and Property 1 holds. However, in this paper, we simply assume that algorithm $A$ is well-structured, in terms of the distribution of $I^{(n)}$, so that by the ergodicity of the channel process this property also holds for these methods.

Recall that $\phi^r(t_k)$ is the normalized time-average of the backlog-rate product over the $k$th test interval. If we assume that the backlog vector is kept fixed at $X(t_k)$, then by ergodicity of the channel process, as explained in Section III-B, we expect $\phi^r(t_k)$ to converge to $\phi(t_k, N_1^r(t_k))$. Hence, when the number of frames is large, which is the case when $N_c$ is large, $\phi^r(t_k)$ should be close to $\phi(t_k, N_1^r(t_k))$ with high probability. However, the backlog vector is not fixed and changes over time. But by Assumption 1, algorithm $A$ statistically responds similarly to different backlog vectors if they are close and sufficiently large. This is exactly our case, since arrivals and departures are limited, and thus, for a fixed $N_c$, the changes in the norm of the backlog vector are bounded over one test interval. Therefore, by Assumption 1, if $\|X(t_k)\|$ is sufficiently large, the changes in the backlog have little impact on the distribution of $\phi^r(t_k)$. Applying a similar discussion to $\phi(\hat{t}_k)$, while noting that the length of update intervals is bounded by $L_1 N_c$, we expect the following property.

Property 2: There exist $\varrho_\phi > 0$ and $\delta_\phi > 0$ such that for any given $\epsilon > 0$, there exists $M > 0$ such that if $\|X_{t_k}\| > M$, then regardless of $k$ and the past history, up to and including time $t_k$, with probability at least $(1 - \varrho_\phi)$

$$|\phi^r(t_k) - \phi(t_k, N_1^r(t_k))| < \delta_\phi + \epsilon.$$

Similarly, regardless of $k$ and the past history, up to and including time $t_k + N_c$, with probability at least $(1 - \varrho_\phi)$

$$|\phi(\hat{t}_k) - \phi(t_k + N_c, N_1(\hat{t}_k))| < \delta_\phi + \epsilon.$$

Moreover,

$$\lim_{N_c \to \infty} \varrho_\phi = \lim_{N_c \to \infty} \delta_\phi = 0.$$



According to the preceding discussion, $\delta_\phi$ and $\varrho_\phi$ mainly measure how fast the time-averages converge to their expected value, and $\epsilon$ models the error due to variations in the backlog vector $X_{t_k+i}$. Thus, as stated above, $\varrho_\phi$ and $\delta_\phi$ can be made arbitrarily small by assuming a sufficiently large value for $N_c$. In a practical implementation, however, $N_c$ is a limited integer, and therefore, $\delta_\phi > 0$ and $\varrho_\phi > 0$. Note that when the first or the second method in Section IV-B is used, Property 2 holds as a result of its preceding discussion, uniform convergence of the channel process, and finiteness of $|\mathcal{I}|$. Similar to Property 1, in the case of the third or the fourth method, we assume this property results from the well-structuredness of algorithm $A$.

As the final step towards the main theorem, we define several random variables that are indirectly used in the theorem statement. Specifically, let $i_\delta$ be a geometric random variable with success probability $\delta'$, where

$$\delta' = (1 - \varrho_\phi)\delta,$$

where $\delta$ is defined in Section IV-B. In addition, let $i_\phi$ be a random variable with the following distribution:

$$P(i_\phi = 0) = \varrho_\phi,$$

and

$$P(i_\phi = k) = (1 - \varrho_\phi)^{2k-1}\left(1 - (1 - \varrho_\phi)^2\right), \quad k \ge 1.$$
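As a quick sanity check on this distribution, the sketch below sums the probabilities numerically. It assumes the truncated last factor reads $1 - (1 - \varrho_\phi)^2$, which is the reading under which the probabilities sum to one. The names `pmf_i_phi` and `total_mass` and the test values of $\varrho_\phi$ are illustrative and do not come from the paper.

```python
def pmf_i_phi(k: int, rho: float) -> float:
    """P(i_phi = k) under the assumed reading of the pmf above."""
    if k == 0:
        return rho
    # k >= 1: (1 - rho)^(2k - 1) * (1 - (1 - rho)^2)
    return (1 - rho) ** (2 * k - 1) * (1 - (1 - rho) ** 2)


def total_mass(rho: float, k_max: int = 10_000) -> float:
    """Partial sum of the pmf over k = 0..k_max (the tail decays geometrically)."""
    return sum(pmf_i_phi(k, rho) for k in range(k_max + 1))


if __name__ == "__main__":
    for rho in (0.1, 0.5):
        print(rho, total_mass(rho))
```

The geometric series $\sum_{k \ge 1} (1 - \varrho_\phi)^{2k-1}$ equals $(1 - \varrho_\phi)/(1 - (1 - \varrho_\phi)^2)$, so the mass on $k \ge 1$ is $1 - \varrho_\phi$, which is what the numerical sum reflects.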
We also define the random sequence $\{N_3(i), i \ge 1\}$ as$^7$

Li (l<i< is) V 

(i — is + i v + 1) 

1 (i = i s + 1) A {i v = 1) 

2 (i = i s + 1) A (i v > 1) 



N 3 (i) 



min( ? | T 5,L 1 ) (i s + 2 < i < is + i v )A 



(V > 1) 



,0 i > is + i v + 1 

Using the above sequence, we define $R_\infty$ as



E[EJ 



(l + jv^CO)] 



(8) 



which plays a key role in the theorem statement and its proof. Note that for a fixed $\delta > 0$, we have

$$\lim_{\varrho_\phi \to 0} R_\infty = \frac{L_1}{1 + L_1}.$$

As mentioned earlier, we can make $\varrho_\phi$ and $\delta_\phi$ arbitrarily small by choosing a sufficiently large value for $N_c$. We are now ready to state the theorem.

C. Main Theorem on Stability of DCP 
We have the following theorem: 

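Theorem 1: Consider a network as described in Section III. For this network, let $\theta$ be a constant defined by

$$\theta = R_\infty \inf_{\|X\|=1} \frac{\hat{\phi}(X) - \alpha - 3\delta_\phi}{\chi(X)}.$$

In addition, let $\theta_\infty$ be

$$\theta_\infty = \inf_{\|X\|=1} \frac{\hat{\phi}(X)}{\chi(X)}.$$

(a) If $6\delta_\phi < \alpha$ and $2\alpha < \inf_{\|X\|=1} \hat{\phi}(X)$, then the network is stable under DCP if the mean arrival rate vector, $a$, lies strictly inside the region $\theta\Gamma$.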

7 Here, $\wedge$ and $\vee$ are the "and" and "or" operators, respectively.



(b) For any input rate strictly inside $\theta_\infty\Gamma$, there exist a sufficiently small value for $\alpha$, and sufficiently large values for $L_1$ and $N_c$, such that the network becomes stabilized under DCP. In other words, we can expand the sufficient stability region $\theta\Gamma$ arbitrarily close to $\theta_\infty\Gamma$ by choosing appropriate values for $\alpha$, $L_1$, and $N_c$.

(c) There exist instances of networks, as described in Section III, for which their associated region $\theta_\infty\Gamma$ is maximally stable under DCP.

Proof: The proof is provided in the Appendix. ■

D. Discussion 

1) Intuitive Explanation of $\theta$: Theorem 1 states that all input rates interior to $\theta\Gamma$ can be stably supported under DCP. In particular, it implicitly quantifies $\theta$ as a function of the sub-optimality of algorithm $A$ and channel state correlation. Clearly, the value of $\theta$ is not fixed, and can vary from one network setup to another. As expected, for a fixed $X$, as algorithm $A$ finds better schedule vectors in shorter times, and as the channel states become more correlated, $\hat{\phi}(X)$ gets closer to $\chi(X)$, and $\theta$ gets closer to one, expanding the region $\theta\Gamma$ to the capacity region $\Gamma$.

In addition, Theorem 1 shows how the stability region is directly affected by the choices for $\alpha$ and $L_1$, and the values for $\delta_\phi$ and $\varrho_\phi$. The impact of $\alpha$ on $\theta$ could be predicted by noting that the update rule uses $N_1^r$ in an update interval only when the normalized average backlog-rate product increases at least by $\alpha$. Thus, we expect to see a decrease of the type $\alpha/\chi(X)$ in the stability region scaling. The effect of $\delta_\phi$ and $\varrho_\phi$ is less obvious, but can be roughly explained as follows. Suppose at the $k$th round the optimal $N_1$ is selected, i.e., $N_1^r(t_k) = \hat{N}_1(t_k)$. In this case, to have a proper comparison, $\phi^r(t_k)$ and $\phi(\hat{t}_{k-1})$ should satisfy their corresponding inequalities in Property 2. Moreover, to make sure that $N_1^r(t_k)$ or a near optimal $N_1$ is used in the $l$th round after the $k$th, we at least require that $\phi^r(t_l)$ satisfy its corresponding inequality in Property 2. Therefore, there are at least three inequalities of the form in Property 2 that should be satisfied, which results in the term $3\delta_\phi$ in the expression for $\theta$.

The factor $R_\infty$ in a sense measures the least fraction of time in update intervals where near optimal values for $N_1$ are used. To better understand $R_\infty$, suppose $\varrho_\phi$ is small, and the backlog vector is large. Once the optimal value for $N_1$ is found in a round, as long as the inequalities in Property 2 hold for the subsequent rounds, $N_1$ gets updated only a few times. By the update rule, this means that $N_3$ gets doubled in most of the rounds, and is likely equal to $L_1$. Thus, the update intervals constitute a fraction $\frac{L_1}{1 + L_1}$ of time. At the same time, in these intervals, near optimal values for $N_1$ are being used. Thus, we expect to see $\frac{L_1}{1 + L_1}$ as a multiplicative factor in $\theta$.
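To get a feel for how large this factor is, the sketch below evaluates $L_1/(1 + L_1)$ for a few values of $L_1$, including the value $L_1 = 32$ used in the case studies. It assumes each round is one test interval of $N_c$ timeslots followed by an update interval of $L_1 N_c$ timeslots, so that $N_c$ cancels; the function name `update_fraction` is illustrative.

```python
def update_fraction(l1: int) -> float:
    """Fraction of time spent in update intervals when every round consists of
    a test interval of N_c slots and an update interval of L_1 * N_c slots.
    N_c cancels, leaving L_1 / (1 + L_1)."""
    return l1 / (1 + l1)


if __name__ == "__main__":
    for l1 in (1, 4, 32):
        print(l1, round(update_fraction(l1), 4))
```

With $L_1 = 32$ the factor is $32/33$, so under these assumptions the loss attributed to test intervals is about three percent.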

The above discussion and Theorem 1 also state that DCP successfully adapts $N_1$ in order to keep $\phi(t_k + N_c, N_1(\hat{t}_k))$ close to $\hat{\phi}(X(t_k + N_c))$.$^8$ Note that for a given $X$, finding $\hat{N}_1(X)$, or equivalently, $\hat{\phi}(X)$, is in general a difficult problem. Specifically, it requires exact knowledge of the channel state and arrival process statistics, and of the structure of algorithm $A$. Even when this knowledge is available, as the number of users increases, finding $\hat{N}_1(X)$ demands computation over a larger number of dimensions, which becomes exponentially complex. Hence, we see that DCP dynamically solves a difficult optimization problem, without requiring the knowledge of input rates or the structure of algorithm $A$.$^9$

8 This statement is in fact a direct result of Lemma 4.

2) Comparison with Static Policies, Minmax vs. Maxmin: Part (b) of the theorem gives the region $\theta_\infty\Gamma$ as the fundamental lower-bound on the limiting performance of DCP. It also implicitly states that this lower-bound depends on the solution to a minmax problem. To see this, recall that by definition $\hat{\phi}(X)$ is the maximum of $\phi(X, N_1)$ over all choices for $N_1$. Thus, we have that

$$\theta_\infty = \inf_{\|X\|=1} \max_{N_1 \in \mathcal{N}_1} \frac{\phi(X, N_1)}{\chi(X)}.$$



Now, consider a static policy that assumes a fixed value for $N_1$. This policy partitions the time axis into a set of frames each consisting of $N_1$ timeslots, with the $i$th frame starting at time $(i - 1)N_1$. The static policy, in the beginning of each frame, e.g., the $i$th frame, provides algorithm $A$ with vectors $X((i - 1)N_1)$ and $s((i - 1)N_1)$. Algorithm $A$ uses these vectors as input, and after spending $N_1$ timeslots, returns a schedule vector as the output. This output vector is then used to schedule users in the next frame.

It is not difficult to show that the above static policy stabilizes the network for all rates interior to $\theta^s_{N_1}\Gamma$, where

$$\theta^s_{N_1} = \inf_{\|X\|=1} \frac{\phi(X, N_1)}{\chi(X)}.$$

Thus, the best static policy, in terms of the region $\theta^s_{N_1}\Gamma$, is the one that maximizes $\theta^s_{N_1}$. Let $\theta^s_o$ be the maximum value. We have that

$$\theta^s_o = \max_{N_1 \in \mathcal{N}_1} \inf_{\|X\|=1} \frac{\phi(X, N_1)}{\chi(X)}.$$

Therefore, the best static policy corresponds to a maxmin problem. Considering the definition of $\theta_\infty$ and $\theta^s_o$, and that the minmax of a function is always larger than or equal to the maxmin, we have that $\theta^s_o\Gamma \subseteq \theta_\infty\Gamma$. More generally, using the above definitions and a simple drift analysis, we can show that the stability region of static policies is not larger than the limiting stability region of DCP.
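The minmax versus maxmin ordering can be illustrated on a small table. In the sketch below, rows stand for backlog directions $X$ and columns for choices of $N_1$; the entries are made-up values of $\phi(X, N_1)/\chi(X)$ chosen only for illustration, and the function names are hypothetical.

```python
def minmax(table):
    """inf over X (rows) of max over N_1 (columns): the DCP-style quantity."""
    return min(max(row) for row in table)


def maxmin(table):
    """max over N_1 (columns) of inf over X (rows): the best static choice."""
    return max(min(col) for col in zip(*table))


# Hypothetical ratios phi(X, N_1) / chi(X); not taken from the paper.
ratios = [
    [0.9, 0.6],  # direction X_a: N_1 = 1 works best
    [0.5, 0.8],  # direction X_b: N_1 = 2 works best
]

if __name__ == "__main__":
    print("minmax:", minmax(ratios))
    print("maxmin:", maxmin(ratios))
```

Here a policy that may pick a different $N_1$ for each direction attains 0.8, while any single fixed $N_1$ attains at most 0.6, which is the gap that the inequality between $\theta_\infty$ and $\theta^s_o$ captures.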

3) Tightness of $\theta_\infty$ and $\theta^s_o$: Note that parts (a) and (b) of the theorem do not exclude the possibility of networks being stable under DCP for rates outside of $\theta\Gamma$ or $\theta_\infty\Gamma$. Part (c) of the theorem, on the other hand, complements parts (a) and (b), and shows that for some networks the region $\theta_\infty\Gamma$ is indeed the largest scaled version of $\Gamma$ that can be stably supported under DCP. This for instance may happen when the channel state is statistically symmetric with respect to users, as in the examples in Section VI. The proof of part (c) of the theorem provides conditions for cases that lead to the maximal stability of the region $\theta_\infty\Gamma$, and in particular, shows that the symmetric examples in Section VI meet such conditions. Note that the same discussion also applies to $\theta^s_{N_1}$ and the stability region of static policies. We therefore have $\theta_\infty$ and $\theta^s_o$ both as tight measures, stating that for some networks, including the ones in the next section, DCP can increase the throughput efficiency of static policies by a factor of $\theta_\infty/\theta^s_o$.

9 DCP also does not require the exact knowledge of channel state statistics. However, a practical implementation of DCP requires $N_c$ to be related to the convergence rate of the channel process to its steady state.

4) Delay: Note that getting close to the boundary of $\theta_\infty\Gamma$ increases delay. This follows from part (b) of the theorem, stating that for input rates close to the boundary, $L_1$ and $N_c$ should be large. These choices, as expected, increase the length of test and update intervals, which can potentially be large intervals of sub-optimal transmissions in terms of the value used for $N_1$. This in turn makes data wait in queues before transmission, thus increasing the delay.

5) Distributed Implementation: Assuming algorithm $A$ is decentralized [7], [5], [6], [9], DCP can be implemented in a distributed manner with low overhead. This is possible since consistent implementation of DCP in all nodes requires updates of only queue backlog and nodes' time-average of backlog-rate product, and such updates are needed only over long time intervals.

More specifically, two conditions must be met for distributed implementation. First, nodes should generate the same sequence of random candidates for $N_1$ over time, which can be achieved by having all nodes employ the same random number generator. Second, nodes should know the backlog-rate product in the test interval and its preceding update interval in order to individually and consistently apply the update rule.
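The first condition can be sketched as follows: if every node seeds an identical pseudo-random generator with a common value, all nodes draw the same candidate sequence without exchanging messages. The uniform draw over the candidate set, the seed value, and the function name `candidate_sequence` are assumptions made for illustration; the paper's actual candidate-selection rule is not reproduced here.

```python
import random


def candidate_sequence(seed: int, choices: list, rounds: int) -> list:
    """Candidates for N_1 that a node would generate locally over `rounds`
    scheduling rounds, given a seed shared by all nodes (assumed uniform draw)."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.choice(choices) for _ in range(rounds)]


if __name__ == "__main__":
    n1_set = list(range(1, 7))   # e.g. N_1 in {1, ..., 6}
    shared_seed = 2024           # hypothetical value agreed on in advance
    node_a = candidate_sequence(shared_seed, n1_set, 10)
    node_b = candidate_sequence(shared_seed, n1_set, 10)
    print(node_a == node_b)
```

Using a private `random.Random` instance per node, rather than the module-level generator, keeps the candidate stream unaffected by any other randomness the node consumes.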

The second condition can also be met, for instance, by requiring each source node to perform the following. Every node, e.g., the $i$th node, records its own backlog, $X_i(t)$, only at the beginning of the test and update intervals. During these intervals, the node also computes its own individual time-average of backlog-rate product $X_i D_i$. Here, we assume the time-averages in the test intervals are computed up to the last $N_d$ timeslots, where $N_d \ll N_c$. Then, once an update interval ends, the $i$th node has the whole duration of a test interval, consisting of $N_c$ timeslots, to send all the other nodes $X_i$ and the time-average of $X_i D_i$ for that update interval. Similarly, when the last $N_d$ timeslots in a test interval are reached, the $i$th node starts sending all the other nodes $X_i$ and the time-average of $X_i D_i$ of that test interval, hence having $N_d$ timeslots for communication. Since for each interval the data of each node, i.e., the backlog in the beginning of the associated interval and the time-average, consists of at most a few bytes, we see that the overhead can be made arbitrarily small by choosing $N_c$ and $N_d$ large. At the same time, we can make the ratio $N_d/N_c$ sufficiently small, by choosing $N_c$ large, to ensure that ignoring the last $N_d$ timeslots in the test intervals has little impact on the stability region.

VI. Case Studies 

In this section, we present two examples that provide further 
insight into our analytical results and the performance of DCP. 
To be able to compare the simulation results with analytical 
ones, we consider a small network consisting of two data flows 
in the downlink of a wireless LAN or a cellular network. In 
this case, s(t) is the vector of channel gains, and we assume 



the schedule vector is the power allocation vector, i.e., $I = P = (p_1, p_2)$, with constraint

$$p_1 + p_2 = P_t,$$



where $P_t$ is the total power budget. Assuming superposition coding is used in the downlink, if $s_1(t) < s_2(t)$, then [33]

$$D_1(s(t), P) = \log\left(1 + \frac{p_1|s_1|}{p_2|s_1| + n_0}\right)$$

and

$$D_2(s(t), P) = \log\left(1 + \frac{p_2|s_2|}{n_0}\right).$$

If $s_1(t) > s_2(t)$, we obtain similar expressions for user rates by swapping the roles of the two users.
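The two-user rate expressions can be written as a small function. This is a sketch under stated assumptions: the weaker user treats the stronger user's signal as interference, $n_0$ denotes the noise level, the logarithm is natural, and ties $s_1 = s_2$ are handled by the second branch; none of these conventions is confirmed by the text, and the name `superposition_rates` is illustrative.

```python
import math


def superposition_rates(s, p, n0):
    """Rates (D_1, D_2) for channel gains s = (s1, s2) and powers p = (p1, p2).

    Assumed form: the user with the smaller gain sees the other user's power
    as interference; the user with the larger gain sees only noise n0.
    """
    s1, s2 = s
    p1, p2 = p
    if s1 < s2:
        d1 = math.log(1 + p1 * s1 / (p2 * s1 + n0))
        d2 = math.log(1 + p2 * s2 / n0)
    else:  # roles swapped (also used for s1 == s2, an arbitrary choice here)
        d1 = math.log(1 + p1 * s1 / n0)
        d2 = math.log(1 + p2 * s2 / (p1 * s2 + n0))
    return d1, d2


if __name__ == "__main__":
    # Gains (1, 5), powers (30, 20) summing to a budget of 50, noise level 10.
    print(superposition_rates((1, 5), (30, 20), 10))
```

Swapping both the gains and the powers between the users swaps the two rates, which is a convenient check that the two branches are mirror images of each other.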

For illustration purposes, we assume that algorithm $A$ in every step, i.e., during each timeslot, reduces the gap to the optimal backlog-rate product. Specifically, if the initial gap corresponding to the initial power vector $I^{(0)}$, assumed to be chosen randomly, is $\Delta_0$, then after $i$ steps the gap is decreased to $\Delta_i$, where

$$\Delta_i = X D^*(X, s) - X D(s, I^{(i)}) = \frac{1}{\beta^i}\left(X D^*(X, s) - X D(s, I^{(0)})\right) = \frac{\Delta_0}{\beta^i},$$

where $\beta > 1$. This case corresponds to $g(n) = (1 - \zeta^n)$ with $\zeta = 1/\beta$, where $g(n)$ is introduced in Section IV-A.
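The geometric shrinking of the gap can be tabulated directly. The sketch below assumes the reading $\Delta_i = \Delta_0/\beta^i$ of the expression above and uses $\beta = 1.7$ from the first case study; the function name `gap_after` and the normalization $\Delta_0 = 1$ are illustrative.

```python
def gap_after(delta0: float, beta: float, steps: int) -> float:
    """Gap to the optimal backlog-rate product after `steps` timeslots,
    assuming the gap shrinks by a factor beta > 1 in every timeslot."""
    return delta0 / beta ** steps


if __name__ == "__main__":
    beta = 1.7
    for n1 in range(1, 7):
        # Remaining gap (as a fraction of the initial gap) after N_1 slots.
        print(n1, round(gap_after(1.0, beta, n1), 4))
```

Longer runtimes leave a smaller gap with respect to the channel state seen at the start of the search, but the channel may have moved by the time the schedule is used, which is the tradeoff that the choice of $N_1$ has to balance.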

Having specified rates and algorithm $A$, as the first example, we assume that the channel state is Markovian with two possible state vectors, namely, $s_1 = (1, 5)$ and $s_2 = (5, 1)$, where the channel vector in each transition takes a different state with probability $p_t = 0.3$. For this case, we set $\alpha = 0.06$, $N_c = 12000$, $L_1 = 32$, $\beta = 1.7$, $\mathcal{N}_1 = \{N_1 : 1 \le N_1 \le 6\}$, $n_0 = 10$, and $P_t = 50$. To study the stability region, we consider the rate vector $a = (2.4181, 2.4181)$, which belongs to the boundary of $\Gamma$ corresponding to this example. We then assume the arrival vector is $\gamma a$, where $\gamma$ is the load factor, and varies from 0.84 to 0.92. Fig. 2 depicts the resulting average queue sizes. For loads larger than 0.93, the queue sizes increase with time, implying network instability. The range selected for $\gamma$ is motivated by noting that $\theta_\infty = 0.9447$, which is computed numerically. Considering the growth of average queue sizes in Fig. 2, we therefore see that for this example $\theta_\infty$ is indeed an upper bound for capacity region scaling. In fact, part (c) of Theorem 1 applies to this example, and any rate of the form $(\theta_\infty + \epsilon)a$, $\epsilon > 0$, makes the network unstable.
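The two-state channel of this example is easy to simulate. The sketch below is only meant to show the channel model, not the scheduling policy: it switches state with probability 0.3 at every timeslot and reports the long-run fraction of time spent in each state, which by symmetry should be close to one half. The function name, the seed, and the run length are illustrative choices.

```python
import random


def simulate_channel(steps: int, p_switch: float = 0.3, seed: int = 0):
    """Simulate the two-state Markov channel and return the fraction of
    timeslots spent in s_1 = (1, 5) and in s_2 = (5, 1)."""
    rng = random.Random(seed)
    idx = 0                      # start in s_1 (arbitrary choice)
    visits = [0, 0]
    for _ in range(steps):
        visits[idx] += 1
        if rng.random() < p_switch:
            idx = 1 - idx        # move to the other state
    return [v / steps for v in visits]


if __name__ == "__main__":
    print(simulate_channel(100_000))
```

Because consecutive states are positively correlated (the chain stays put with probability 0.7), a schedule computed for the current state remains useful for a few timeslots, which is what makes non-trivial values of $N_1$ worthwhile in this example.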

As for the second example, we increase the number of states to six, corresponding to the following state vectors:

$$s_1 = (1, 5), \quad s_2 = (5, 1),$$
$$s_3 = (1, 2), \quad s_4 = (2, 1),$$
$$s_5 = (2, 5), \quad s_6 = (5, 2),$$

and having the following symmetric transition matrix:

$$\begin{pmatrix} 0.3 & 0.1 & 0.2 & 0.1 & 0.2 & 0.1 \\ 0.1 & 0.3 & 0.1 & 0.2 & 0.1 & 0.2 \\ \vdots & & & & & \vdots \\ 0.1 & 0.2 & 0.1 & 0.2 & 0.1 & 0.3 \end{pmatrix} \quad (9)$$
Fig. 2. Average queue size as a function of load factor.

Fig. 3. Comparison of capacity region scaling for DCP and static policies.

For this case, we keep the same $N_c$, $L_1$, and $\mathcal{N}_1$, but assume $\alpha = 0.02$, $\beta = 1.5$, $n_0 = 50$, and $P_t = 10$. Similar to the previous example, to vary the arrival rate vector, we consider the rate vector $a = (0.6952, 0.6952)$, which belongs to the boundary of $\Gamma$ associated with this example. Then, the arrival vector is assumed to be $\gamma a$, where the load factor $\gamma$ varies from 0.67 to 0.76. The resulting average queue sizes are also shown in Fig. 2. In this case, for load factors larger than 0.76, the queue sizes increase with time, suggesting network instability. This result is consistent with our analytical results since the numerically computed value of $\theta_\infty$ is 0.7762. Note that part (c) of Theorem 1 also applies to this example, and any rate of the form $(\theta_\infty + \epsilon)a$, $\epsilon > 0$, makes the network unstable.

Finally, in Fig. 3, for the two examples, we have shown $\theta^s_{N_1}$ as a function of $N_1$, and also shown the value of $\theta_\infty$ for DCP. As expected, and as the figure suggests, since DCP adapts $N_1$ according to queue backlog, it outperforms the best static policy. We also see that the optimal static policy for the first example is the one with $N_1 = 3$ and $\theta^s_o = 0.9122$, and for the second example is the one with $N_1 = 2$ and $\theta^s_o = 0.7511$. Note that characterization of the best static policy requires computation of $\hat{\phi}(X)$, which, as briefly discussed in Section V-D1, can be computationally intensive. From the figure, we also observe that the performance of a suboptimal static policy can be substantially lower than that of DCP if the static policy does not assume a proper value for $N_1$.
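The improvement factor $\theta_\infty/\theta^s_o$ for the two examples follows directly from the numbers reported above. The sketch below only repeats that arithmetic; the dictionary layout and the function name `gain` are illustrative.

```python
# (theta_inf for DCP, theta_o^s for the best static policy), as reported above.
reported = {
    "example 1": (0.9447, 0.9122),
    "example 2": (0.7762, 0.7511),
}


def gain(theta_inf: float, theta_static: float) -> float:
    """Throughput-efficiency factor of DCP over the best static policy."""
    return theta_inf / theta_static


if __name__ == "__main__":
    for name, (t_inf, t_static) in reported.items():
        print(name, round(gain(t_inf, t_static), 4))
```

Both ratios are a little above 1.03, i.e., in these two symmetric examples DCP enlarges the scaled region of the best static policy by roughly three to four percent.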

VII. Conclusion 

In this paper, to improve the stable throughput region in 
practical network setups, we have considered the problem of 
scheduling in time-varying networks from a new perspective. 
Specifically, in contrast to previous research which assumes 
the search-time to find schedule vectors is negligible, we 



have considered this time, based on which we modeled the 
time-efficiency of sub-optimal algorithms. Inspired by this 
modeling, we have proposed a dynamic control policy that 
dynamically but in a large time-scale tunes the time given to 
an available sub-optimal algorithm according to queue backlog 
and channel correlation. Remarkably, this policy does not require knowledge of input rates or the structure of available sub-optimal algorithms, nor does it require exact statistics of the channel process. We have shown that this policy can be
implemented in a distributed manner with low overhead. In 
addition, we have analyzed the throughput stability region 
of the proposed policy and shown that its throughput region 
is at least as large as the one for any other, including the 
optimal, static policy. We believe that study and design of 
similar policies opens a new dimension in the design of 
scheduling policies, and in parallel to the efforts to improve 
the performance of sub-optimal algorithms, can help boost the 
throughput performance to the capacity limit. 

Appendix 
Proof of Theorem 1

Proof of part (a): 

The proof of part (a) consists of two main parts. First, using 
several lemmas, we obtain a negative drift with a random 
number of steps. In the second part, we use the negative drift 
analysis to show that the return time to a bounded region 
has a finite expected value, and conforms to the properties 
required for network stability, according to the definition given 
in Section V-A1.

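We start by noting that $\theta \le 1$, and since $a$ is strictly inside $\theta\Gamma$, there must be some non-negative constants $\beta_{s,I}$ with the property that for all $s \in \mathcal{S}$

$$\sum_{I \in \mathcal{I}} \beta_{s,I} < \theta \le 1, \quad (10)$$

such that

$$a = \sum_{s \in \mathcal{S}} \pi(s) \sum_{I \in \mathcal{I}} \beta_{s,I} D_{s,I}. \quad (11)$$

Considering (10), we can define positive $\xi'$ as

$$\xi' = \theta - \max_{s \in \mathcal{S}} \sum_{I \in \mathcal{I}} \beta_{s,I}.$$

Since $\xi' > 0$, by the definition of $\theta$, for $\|X_t\| \neq 0$, we have that

$$\frac{R_\infty(\hat{\phi}(t) - \alpha - 3\delta_\phi)}{\chi(X_t)} - \max_{s \in \mathcal{S}} \sum_{I \in \mathcal{I}} \beta_{s,I} \ge \xi' > 0. \quad (12)$$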


To proceed with the proof, associated with a given time $t$, we define a sequence of random variables $\{\tau_i\}_{i=-1}^{\infty}$, where $\tau_{-1}$ and $\tau_0$ denote the number of timeslots to the last timeslot of the previous and the current scheduling round, respectively, and $\tau_i$, $i \ge 1$, is the number of timeslots to the last timeslot of the $i$th subsequent scheduling round. Let $\mathcal{H}_t$ denote the past history of the system up to and including time $t$. Thus, given $\mathcal{H}_t$, the value of $X_t$ is known. Let $f(\cdot)$ be defined as

$$f(X) = \|X\|^2.$$

Considering a $(\tau_K + 1)$-step drift with function $f(\cdot)$, we can write

$$\Delta(\tau_K + 1) = E\left[f(X_{t+\tau_K+1}) - f(X_t) \,\big|\, \mathcal{H}_t\right]$$
$$= E\left[\sum_{k=0}^{\tau_K} f(X_{t+k+1}) - f(X_{t+k}) \,\Big|\, \mathcal{H}_t\right]$$
$$= E\left[\sum_{k=0}^{\tau_K} (X_{t+k+1} + X_{t+k})(X_{t+k+1} - X_{t+k}) \,\Big|\, \mathcal{H}_t\right].$$
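The last two steps of this expansion rest on telescoping and on the identity $\|u\|^2 - \|v\|^2 = (u + v)\cdot(u - v)$. The sketch below checks both on a short, made-up backlog trajectory; the helper names and the trajectory itself are illustrative and carry no meaning beyond the check.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def f(x):
    """Quadratic Lyapunov function f(X) = ||X||^2."""
    return dot(x, x)


def telescoped_drift(path):
    """Sum over consecutive steps of (X_next + X_cur) . (X_next - X_cur)."""
    total = 0
    for cur, nxt in zip(path, path[1:]):
        plus = [a + b for a, b in zip(nxt, cur)]
        minus = [a - b for a, b in zip(nxt, cur)]
        total += dot(plus, minus)
    return total


# Hypothetical two-user backlog trajectory over three timeslots.
path = [(3, 1), (4, 0), (2, 2), (5, 1)]

if __name__ == "__main__":
    print(telescoped_drift(path), f(path[-1]) - f(path[0]))
```

Both quantities agree for any trajectory, since each summand equals $f(X_{t+k+1}) - f(X_{t+k})$ and the intermediate terms cancel.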

Using the fact that arrivals and departures are bounded, after performing some preliminary steps, we can show that

$$\Delta(\tau_K + 1) \le E\left[(\tau_K + 1)C_1 + (\tau_K + 1)^2 C_2 + 2\sum_{k=0}^{\tau_K}\left(X_t A_{t+k} - X_{t+k} D_{t+k}\right) \,\Big|\, \mathcal{H}_t\right]$$

for appropriate constants $C_1$ and $C_2$. Since $X_{t+k} D_{t+k} \ge 0$, we have

$$\Delta(\tau_K + 1) \le E\Big[(\tau_K + 1)C_1 + (\tau_K + 1)^2 C_2 + 2\sum_{k=0}^{\tau_K}\left(X_t A_{t+k} - X_t a\right) + 2\sum_{k=0}^{\tau_K}\left(X_t a - X_{t+k} D^*_{t+k}\right) + 2\sum_{k=0}^{\tau_K} X_{t+k} D^*_{t+k} - 2\sum_{k=\tau_0+1}^{\tau_K} X_{t+k} D_{t+k} \,\Big|\, \mathcal{H}_t\Big],$$

where $D^*_{t+k} = D^*(X(t+k), s(t+k))$. In the following, we derive an upper bound for $\Delta(\tau_K + 1)$.

As mentioned in Section III-A, arrivals are i.i.d. with mean vector $a$. We can therefore apply the same method used to prove Lemma 1 to obtain

$$E\left[\Big\|\sum_{k=0}^{\tau_K} A_{t+k} - (\tau_K + 1)a\Big\| \,\Big|\, \mathcal{H}_t\right] \le \epsilon\, E\left[(\tau_K + 1) \,\big|\, \mathcal{H}_t\right],$$

where $\epsilon > 0$, and can be made arbitrarily small by choosing a sufficiently large $K$.

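Using the above inequality, Lemma 2, Lemma 3, and Lemma 4, all with the same choice for $\epsilon$, we can show that

$$\Delta(\tau_K + 1) \le E\left[(\tau_K + 1)\|X_t\|\chi(X_t)\left(\epsilon_1 + 2\left(\max_{s \in \mathcal{S}} \sum_{I \in \mathcal{I}} \beta_{s,I} - \frac{R_\infty(\hat{\phi}(t) - \alpha - 3\delta_\phi)}{\chi(X_t)}\right)\right) \,\Big|\, \mathcal{H}_t\right], \quad (13)$$

where

$$\epsilon_1 = \frac{1}{\chi(X_t)}\left(\frac{C_1}{\|X_t\|} + \frac{C_2(\tau_K + 1)}{\|X_t\|} + 8\epsilon\right). \quad (14)$$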

Note that according to the lemmas, $\epsilon$ can take any given positive real number if $K$ and $\|X_t\|$ are sufficiently large. Similarly, $\epsilon_1$ can assume any given positive value. To see this, first note that since $a \in \theta\Gamma$, we have $a \in \Gamma$. Thus, for any user, e.g., the $i$th user, for which $a_i > 0$, there has to be a state $s$ and a schedule $I$ satisfying

$$\pi(s) D(s, I)_i > 0,$$

where $D(s, I)_i$ is the $i$th element of vector $D(s, I)$. Otherwise, $a_i$ should be zero, contradicting the assumption. Therefore, assuming $a \neq 0$, we can define positive $v$ as

$$v = \min_{i \in \mathcal{N}} \max_{s, I} \pi(s) D(s, I)_i > 0.$$

Thus,

$$E[X_t D^*(X_t, s)] \ge v \max_i X(t)_i \ge \frac{v\|X_t\|}{\sqrt{N}}. \quad (15)$$

This implies that for all nonzero $X \in \mathbb{R}^N$

$$\chi(X) \ge \frac{v}{\sqrt{N}}. \quad (16)$$

On the other hand, since departure rates are bounded above by $D_{\max}$, we have

$$\chi(X) \le \sqrt{N} D_{\max}. \quad (17)$$

Now consider any positive $\epsilon_2$, and suppose $K$ is sufficiently large such that for large $\|X_t\|$ we have

$$\frac{8\epsilon}{\chi(X_t)} \le \frac{8\sqrt{N}\epsilon}{v} < \frac{\epsilon_2}{3},$$

where the first inequality follows from (16). This upper-bounds the third term in $\epsilon_1$. Since for any $K$, and in particular the chosen one, we have $\tau_K + 1 \le (K + 1)(1 + L_1)N_c$, we see that if $\|X_t\|$ is appropriately large, the first and second terms in $\epsilon_1$ can also be less than $\epsilon_2/3$. Thus, for any given positive $\epsilon_2$, we can find an appropriately large $K$ such that for sufficiently large $\|X_t\|$, (13) holds with $\epsilon_1 < \epsilon_2$.

Suppose $K$ is sufficiently large, and $\|X_t\| > M_K$ for an appropriately large $M_K$ such that $\epsilon_1 < \xi'$. We can use (13) and (12) to show that

$$\Delta(\tau_K + 1) \le -E\left[\xi'\|X_t\|(\tau_K + 1)\chi(X_t) \,\big|\, \mathcal{H}_t\right].$$

This inequality and (16) further imply that

$$\Delta(\tau_K + 1) \le -E\left[\xi(\tau_K + 1)\|X_t\| \,\big|\, \mathcal{H}_t\right], \quad (18)$$

where $\xi = \frac{v}{\sqrt{N}}\xi' > 0$. We, therefore, have obtained the negative drift expression, completing the first part of the proof.

Note that in the above, $\tau_K$ is a random variable, and in fact, is a stopping time with respect to the filtration $\mathcal{H} = \{\mathcal{H}_t\}_{t=0}^{\infty}$. This means that we have obtained a drift expression that is based on a random number of steps. Proofs of stability in the literature, however, are often based on a negative drift with a fixed number of steps. This contrast has motivated us to adopt an interesting method recently developed in [10]. This method is general since it can be applied in both cases, and also leads to an intuitive notion of stability. However, it was originally developed for Markov chains. Therefore, as well as using less technical notation, in what follows we apply minor modifications to the method so that it is appropriate in our context.

We now, in the second part of the proof, use the negative drift, and prove that the expected value of the return time to some bounded region is finite in a manner that renders the network stable. Let $C$ denote the bounded region, defined as

$$C = \{X \in \mathbb{R}^N, \|X\| \le M_K\}.$$

Associated with $C$, we define $\sigma_C$ to be the number of timeslots after which the process $\{X_{t+i}\}_{i=0}^{\infty}$ enters $C$, i.e.,

$$\sigma_C = \inf\{i \ge 0 : X_{t+i} \in C\}.$$

Similarly, we let $\tau_C$ be

$$\tau_C = \inf\{i \ge 1 : X_{t+i} \in C\}.$$

Therefore, $\tau_C$, in contrast to $\sigma_C$, characterizes the first time that the process $\{X_{t+i}\}_{i=1}^{\infty}$ returns to $C$.

Back to the drift expression in (18), let $\eta$ be a random variable defined by

$$\eta = \xi(\tau_K + 1)\|X_t\|.$$

We obtain, for $K$ sufficiently large,

$$E\left[f(X_{t+\tau_K+1}) + \eta \,\big|\, \mathcal{H}_t\right] \le f(X_t), \quad (19)$$

provided that $\|X_t\| > M_K$. Let $\eta_0 = \eta$, and $\tau_{K,0} = \tau_K$, where $\eta$ and $\tau_K$ are random variables defined by considering time $t$. We now consider time $t_K^{(1)} = t + \tau_{K,0} + 1$. For this particular time, we can define another pair $\tau_{K,1}$ and $\eta_1$ such that if $\|X_{t_K^{(1)}}\| > M_K$, then

$$E\left[f(X_{t_K^{(1)}+\tau_{K,1}+1}) + \eta_1 \,\big|\, \mathcal{H}_{t_K^{(1)}}\right] \le f(X_{t_K^{(1)}}),$$

where $\tau_{K,1}$ is the number of timeslots from time $t_K^{(1)}$ to the last timeslot of the $K$th subsequent scheduling round, and

$$\eta_1 = \xi(\tau_{K,1} + 1)\|X_{t_K^{(1)}}\|.$$

Note that the definition of $\tau_{K,1}$ and $\eta_1$ is independent of whether the previous inequality holds.

We can continue this process by considering the drift criterion for time $t_K^{(i)} = t_K^{(i-1)} + \tau_{K,i-1} + 1$, and defining random variables $\tau_{K,i}$ and $\eta_i$. The random variables $\tau_{K,i}$ and $\eta_i$ have a similar definition as $\tau_{K,1}$ and $\eta_1$, respectively, except that they are associated with time $t_K^{(i)}$. Using these definitions, we can define $t_K^{(i)}$ more precisely by

$$t_K^{(i)} = t_K^{(i-1)} + (\tau_{K,i-1} + 1) = t + \sum_{j=0}^{i-1}(\tau_{K,j} + 1).$$

Note that $t_K^{(i)}$ is a stopping time with respect to $\mathcal{H}$. Using $t_K^{(i)}$, we set

$$X_i = X_{t_K^{(i)}}, \quad i \ge 0, \quad (20)$$

and define $\mathcal{H}^\tau$ as the filtration given by $\mathcal{H}^\tau = \{\mathcal{H}_{t_K^{(i)}}\}_{i=0}^{\infty}$. In addition, associated with $\eta_i$, which is given by

$$\eta_i = \xi(\tau_{K,i} + 1)\|X_{t_K^{(i)}}\|,$$

we define $\eta^{(i)}$ as

$$\eta^{(0)} = 0, \quad \eta^{(i)} = \sum_{j=0}^{i-1} \eta_j. \quad (21)$$

We also define $\nu$ as

$$\nu = \inf\{i \ge 0 : t_K^{(i)} \ge t + \sigma_C\}, \quad (22)$$

which is a stopping time with respect to $\mathcal{H}^\tau$. Intuitively, $\nu$ marks the first time $t_K^{(i)}$ at or before which the process $\{X_{t+i}\}_{i=0}^{\infty}$ enters $C$. We finish the chain of definitions by introducing the sequence $\{Z_i\}_{i=0}^{\infty}$, where

$$Z_i = f(X_i) + \eta^{(i)}. \quad (23)$$

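For $i < \nu$, using (21), we have

$$E\left[Z_{i+1} \,\big|\, \mathcal{H}_{t_K^{(i)}}\right] = E\left[f(X_{i+1}) + \eta_i \,\big|\, \mathcal{H}_{t_K^{(i)}}\right] + \eta^{(i)} \le f(X_i) + \eta^{(i)} = Z_i, \quad (24)$$

where the first equality follows from the fact that $\eta^{(i)}$ is completely determined given $\mathcal{H}_{t_K^{(i)}}$, and the inequality is simply an immediate result of (19) and the assumption $i < \nu$. To simplify the notation, let $\nu \wedge i$ denote

$$\nu \wedge i = \min(\nu, i).$$

It now follows directly from (24) that the sequence $\{Z_{\nu \wedge i}\}_{i=0}^{\infty}$ is an $\mathcal{H}^\tau$-supermartingale. Since $f(\cdot)$ is non-negative, we have

$$E\left[\eta^{(\nu \wedge i)} \,\big|\, \mathcal{H}_t\right] \le E\left[Z_{\nu \wedge i} \,\big|\, \mathcal{H}_t\right].$$

But $\mathcal{H}_t = \mathcal{H}_{t_K^{(0)}}$, and $\{Z_{\nu \wedge i}\}_{i=0}^{\infty}$ is a supermartingale. Hence,

$$E\left[Z_{\nu \wedge i} \,\big|\, \mathcal{H}_t\right] = E\left[Z_{\nu \wedge i} \,\big|\, \mathcal{H}_{t_K^{(0)}}\right] \le Z_0 = f(X_t).$$

Considering the last two inequalities, we obtain

$$E\left[\eta^{(\nu \wedge i)} \,\big|\, \mathcal{H}_t\right] \le f(X_t). \quad (25)$$

In addition, using the definition of $\eta^{(i)}$ and $\eta_j$ while assuming $M_K \ge 1$, it is easy to see that

$$\eta^{(\nu \wedge i)} = \sum_{j=0}^{(\nu \wedge i)-1} \eta_j \ge \xi \sum_{j=0}^{(\nu \wedge i)-1} (\tau_{K,j} + 1) = \xi\left(t_K^{(\nu \wedge i)} - t\right). \quad (26)$$

Applying the monotone convergence theorem [29], we can take the limit in (25) and (26) as $i \to \infty$, yielding

$$E\left[t_K^{(\nu)} - t \,\big|\, \mathcal{H}_t\right] \le \xi^{-1} f(X_t).$$

But by the definition in (22), $\sigma_C \le t_K^{(\nu)} - t$. Thus, for $X_t \notin C$,

$$E[\sigma_C \,|\, \mathcal{H}_t] \le \xi^{-1} f(X_t).$$

If $X_t \in C$, we have $\sigma_C = 0$. Hence, we have that

$$E[\sigma_C \,|\, \mathcal{H}_t] \le \xi^{-1} f(X_t)\,\mathbf{1}_{X_t \notin C},$$

showing that the expected $\sigma_C$ is bounded by a function of $X_t$ uniformly in the past history and $t$, as required. This completes the proof of part (a) of the theorem. ■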
Proof of part (b): Part (b) follows directly from part (a) of the theorem as a corollary by noting that $\delta_\phi$ and $\varrho_\phi$ can be made arbitrarily small by assuming a sufficiently large $N_c$, as stated in Property 2. This allows us to select arbitrarily small values for $\alpha$. In addition, we can choose a sufficiently large value for $L_1$ such that for sufficiently small values of $\delta_\phi$ and $\varrho_\phi$, $R_\infty$ is arbitrarily close to one. Considering these choices, we see that we can make $\theta$ arbitrarily close to $\theta_\infty$, as required. ■

Proof of part (c): Since part (c) of the theorem only concerns the existence of networks for which the region $\theta_\infty\Gamma$ is maximally stable under DCP, for simplicity of exposition we consider a network consisting of two users, i.e., two data flows. Note that our approach can be extended to more general networks with $N$ data flows. Here, we adopt a direct method and show that, with positive probability, the norm of the backlog vector approaches infinity. Therefore, the expected value of the return time to any bounded region becomes infinite, implying network instability. We start by introducing several definitions followed by four conditions sufficient for network instability.
Let $D$ and $D^*$ be defined by

$$D(X) = E\Big[\lim_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} D_{t+i} \,\Big|\, X_{t+i} = X,\ i \ge 0\Big]$$

and

$$D^*(X) = E_s[D^*(X, s)],$$
where D*(X, s) is defined in (fJJ- In addition, let X m j„ beF"l 

X mm = argmt . 

I|x||=i XW 

Note that in the definition of $D$, we hypothetically assume that the backlog vector after time $t$ is fixed and does not change. This is similar to the method used to define $\phi(X, N_1)$, except that here we do not assume a fixed value for $N_1$ and instead assume that DCP adapts $N_1$ as if the backlog vector were changing. In addition, note that by the ergodicity of the channel process $D$ does not depend on $t$, and moreover, by Property 1, $D$ does not depend on $\|X\|$. To simplify the subsequent analysis, we also consider the following definitions:

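Definition 1: For a given $X$ and a given $\epsilon > 0$, the $\epsilon$-neighborhood of $X$ is defined by

$$\mathcal{N}(X, \epsilon) = \{X_1 : \|X_1 - X\| < \epsilon\}.$$

Definition 2: For a given $X$ with $\|X\| = 1$, and a given $\epsilon > 0$, the normalizing region $\mathcal{R}(X, \epsilon)$ is defined by

$$\mathcal{R}(X, \epsilon) = \Big\{X_1 : X_1 \neq 0,\ \Big\|\frac{X_1}{\|X_1\|} - X\Big\| < \epsilon\Big\} \cup \{X_1 = 0\}.$$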



Definition 3: Consider a region $\mathcal{R}$ and a vector $X$ inside $\mathcal{R}$. We define $\mathcal{E}(X, \mathcal{R})$ as the supremum of the angular deviation of the vectors in $\mathcal{R}$ from $X$, i.e.,

$$\mathcal{E}(X, \mathcal{R}) = \sup_{Y \in \mathcal{R}} \arccos\Big(\frac{X \cdot Y}{\|X\|\,\|Y\|}\Big).$$

10 Note that here the infimum can be achieved since the functions $\phi(X)$ and $\chi(X)$ are continuous functions of $X$, and the infimum is taken over a closed interval.
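
Definitions 1 to 3 are purely geometric, so they can be checked numerically. The sketch below is illustrative only: it reads Definition 2 as the cone of vectors whose normalized direction lies within $\epsilon$ of the unit vector $X$ (plus the origin), and it evaluates Definition 3 over a finite sample of points rather than a true supremum. All function names are hypothetical.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a sequence of numbers."""
    return math.sqrt(sum(c * c for c in v))

def angle(x, y):
    """Angle in radians between two non-zero vectors x and y."""
    cos = sum(a * b for a, b in zip(x, y)) / (norm(x) * norm(y))
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp rounding error

def in_neighborhood(x1, x, eps):
    """Definition 1: x1 lies within Euclidean distance eps of x."""
    return norm([a - b for a, b in zip(x1, x)]) < eps

def in_normalizing_region(x1, x, eps):
    """Definition 2 (as read here): x1 is zero, or its direction
    x1 / ||x1|| lies within eps of the unit vector x."""
    n = norm(x1)
    if n == 0:
        return True
    return in_neighborhood([c / n for c in x1], x, eps)

def angular_deviation(x, sample_points):
    """Definition 3 over a finite sample of non-zero points of a region."""
    return max(angle(x, y) for y in sample_points)
```

A long vector such as `(5, 0.1)` is far from the unit vector `(1, 0)` in the sense of Definition 1 but belongs to the normalizing region around it, which is the distinction the proof relies on when it characterizes continuity through $\mathcal{R}(X, \epsilon)$.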



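Now suppose the following conditions hold:

C1) $X_{\min} = \gamma_1 D(X_{\min}) = \gamma_2 D^*(X_{\min})$, for some $\gamma_1, \gamma_2 > 0$.

C2) For any $N_{1,1} \in \mathcal{N}_1$ and $N_{1,2} \in \mathcal{N}_1$ with $N_{1,1} \neq N_{1,2}$, we have $\phi(X_{\min}, N_{1,1}) \neq \phi(X_{\min}, N_{1,2})$.

C3) For any $\beta_1 > 0$ and $\beta_2 > 0$, there exists a sufficiently small $\epsilon > 0$ such that if $X \in \mathcal{R}(X_{\min}, \epsilon)$, then

$$D(X) - D(X_{\min}) = \lambda_1 D(X_{\min}) + \lambda_2 \Big(\frac{X}{\|X\|} - X_{\min}\Big),$$

for some $\lambda_1$ and $\lambda_2$ satisfying $|\lambda_1| < \beta_1$ and $0 < \lambda_2 < \beta_2$.

C4) For any $X \in \mathbb{N}^N$, for some $t$,

$$P(X_t = X) > 0.$$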

Condition C1 may be met by assuming statistically symmetric channel states such as the ones in Section VI. Condition C2 simply requires the function $\phi(X_{\min}, N_1)$ to be a one-to-one function of $N_1$ at $X_{\min}$. Condition C3 intuitively states that the average departure rates should be a continuous function$^{11}$ of $X$ around $X_{\min}$, and in particular, when $X$ deviates from $X_{\min}$, these rates should deviate from $D(X_{\min})$ in a similar manner. This is in fact expected, as increasing the backlog vector in one dimension should increase the expected departure rate in that dimension, which can be considered as a result of the approximation to the GMWM problem through the use of algorithm A. Note that in C3, where appropriate, the vector $X$ is normalized by its norm since $D(X)$ does not depend on $\|X\|$. Finally, C4 simply requires the process $\{X_t\}$ to be able to reach all vectors in $\mathbb{N}^N$, although what we need for the proof is a relaxed version of this assumption. Using the numerical results for $\phi(X_{\min}, N_1)$ and $D^*(X_{\min})$, and the symmetry of channel states, it is easy to verify that the conditions C2-C4 also hold for the examples in Section VI. Therefore, there are examples for which the conditions C1-C4 hold. Next, we show that these conditions are sufficient for network instability, completing the proof of part (c).

First, note that $D^*(X_{\min}) \in \Gamma$, which directly follows from the definition of $D^*(X_{\min})$ and $\Gamma$. Second, the rate $D^*(X_{\min})$ belongs to the boundary of $\Gamma$; otherwise we could find another vector $D$ inside $\Gamma$ and within a small neighborhood of $D^*(X_{\min})$ with a larger backlog-rate product, in contradiction with the definition of $D^*(X_{\min})$. Hence, we see that the rate $\theta_\infty D^*(X_{\min})$ belongs to the boundary of $\theta_\infty\Gamma$. Third, we can see that by the definition of $\theta_\infty$ and $X_{\min}$

$$\theta_\infty = \frac{\phi(X_{\min})}{\chi(X_{\min})} \ge \frac{X_{\min} D(X_{\min})}{\chi(X_{\min})}.$$

This is because DCP may use sub-optimal values for $N_1$, which by C2 make $X_{\min} D$ less than $\phi(X_{\min})$ when $N_c$ is large. Using C1 and the above inequality, we have

$$\theta_\infty \ge \frac{\|D(X_{\min})\|}{\|D^*(X_{\min})\|},$$

11 As opposed to traditional definitions, which usually use $\mathcal{N}(X, \epsilon)$ to define continuity, here the region $\mathcal{R}(X, \epsilon)$ is used to characterize continuity.
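[Figure 4 omitted. Region labels in the figure: $\mathcal{R}(X_{\min}, \epsilon) = \mathcal{A}_1 \cup \mathcal{A}_2 \cup \mathcal{A}_3 \cup \mathcal{A}_4$, $\mathcal{R}_1 = \mathcal{A}_2 \cup \mathcal{A}_3 \cup \mathcal{A}_4$, $\mathcal{R}_2 = \mathcal{A}_4$.]

Fig. 4. Illustration of regions $\mathcal{R}(X_{\min}, \epsilon)$, $\mathcal{R}_1$, and $\mathcal{R}_2$.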



which implies that

$$D(X_{\min}) \le \theta_\infty D^*(X),$$

where the inequality is component-wise. Without loss of generality, we assume that

$$D(X_{\min}) = \theta_\infty D^*(X). \tag{27}$$

Let the input rate be

$$a = (\theta_\infty + \sigma) D^*(X), \tag{28}$$

for some $\sigma > 0$, which is clearly outside of the region $\theta_\infty\Gamma$.
Let $\Delta_{t,n}$ be the drift vector defined by

$$\Delta_{t,n} = \frac{1}{n}\sum_{i=0}^{n-1} A_{t+i} - \frac{1}{n}\sum_{i=0}^{n-1} (D_{t+i} - U_{t+i}).$$

In addition, let

$$\Delta_X = a - D(X).$$

Note that $\Delta_X$ does not depend on $\|X\|$ since $D(X)$ has the same property, as pointed out earlier. Suppose for a given $\epsilon_1$, the values for $\beta_1$ and $\beta_2$ are chosen such that by C3 if $X_t \in \mathcal{R}(X_{\min}, \epsilon)$, for appropriately small $\epsilon$, then the following holds:

$$\|\Delta_{X_t} - \Delta_{X_{\min}}\| < \epsilon_1.$$

Using Assumption 1, condition C2, channel ergodicity as stated in Section III-B, and that arrivals are i.i.d., it is not hard to see that,$^{12}$ when $\epsilon$ is sufficiently small, for any positive $\epsilon_2$ and $0 < \zeta < 1$, we can first choose $n$ large and then $M_{\epsilon_2}$ sufficiently large, and define the region $\mathcal{R}_1$ as

$$\mathcal{R}_1 = \{X : \|X\| > M_{\epsilon_2},\ X \in \mathcal{R}(X_{\min}, \epsilon)\}$$

such that

$$P\big(\|\Delta_{t,n} - \Delta_{X_t}\| < \epsilon_2 \,\big|\, \mathcal{H}_t,\ X_t \in \mathcal{R}_1\big) > (1 - \zeta), \tag{29}$$

where, in the above,

$$\Delta_{X_t} = \frac{\sigma - \lambda_1 \theta_\infty}{\theta_\infty + \sigma}\, a - \lambda_2 \Big(\frac{X_t}{\|X_t\|} - X_{\min}\Big). \tag{30}$$

12 A discussion similar to the one for Property 2 applies here.



[Figure 5 omitted. The figure shows the boundary of $\mathcal{R}_2$, the points $X = MX_{\min}$, $X_1$, $X_2$, and $X_3$, and a circle of radius $n\sqrt{2}(A_{\max} + D_{\max})$.]

Fig. 5. Examples where $X_{t+ni} \in \mathcal{R}_2$, illustrating the case $\Lambda_{t+ni,n} = 1$, as in the points $X$, $X_1$, and $X_2$, and the case $\Lambda_{t+ni,n} = 0$, as in the point $X_3$. In this figure, the region $\mathcal{R}_2$ is rotated clockwise.

The above equality is obtained by using condition C3, equality (27), and considering that the input rate is given by (28). In particular, we have that



A 



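Since $\epsilon_2$ can be made arbitrarily small by choosing sufficiently large $n$ and $M_{\epsilon_2}$, we assume that for all $X \in \mathcal{R}_1$

$$\mathcal{E}\big(\Delta_X, \mathcal{N}(\Delta_X, \epsilon_2)\big) < \frac{1}{2}\, \mathcal{E}\big(X_{\min}, \mathcal{R}(X_{\min}, \epsilon)\big). \tag{31}$$

Hence, according to (29) and (31), for $X_t \in \mathcal{R}_1$, with probability larger than $(1 - \zeta)$ the drift $\Delta_{t,n}$ is close to $\Delta_{X_t}$, with a supremum angular deviation that is less than half of the supremum angular deviation of the vectors in $\mathcal{R}(X_{\min}, \epsilon)$ from $X_{\min}$.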
To continue, let the region $\mathcal{R}_2$ be defined as

$$\mathcal{R}_2 = \{X : X - MX_{\min} \in \mathcal{R}(X_{\min}, \epsilon)\}, \tag{32}$$

for some $M > M_{\epsilon_2}$. This region is a shifted version of $\mathcal{R}(X_{\min}, \epsilon)$ with the origin shifted to $MX_{\min}$, and therefore $\mathcal{R}_2 \subset \mathcal{R}_1$. Fig. 4 provides a graphical demonstration of regions $\mathcal{R}(X_{\min}, \epsilon)$, $\mathcal{R}_1$, and $\mathcal{R}_2$. In the figure, the vector $X_{\min}$ is shown by a unit arrow-vector. Now we are in a position to show that starting at $X_t = MX_{\min}$, for some appropriately chosen $M$, with positive probability $\{X_{t+i}, i \ge 0\}$ stays in $\mathcal{R}_2$ with ever-growing norm.

Consider the sequence $\{X_{t+ni}\}_{i=0}^{\infty}$ with $X_t = MX_{\min}$. Recall that $n$ is chosen sufficiently large according to the value of $\epsilon_2$. Let $\Lambda_{t+ni,n}$ be a random variable defined by

$$\Lambda_{t+ni,n} = \begin{cases} 1 & \text{if } \|\Delta_{t+ni,n} - \Delta_{X_{t+ni}}\| < \epsilon_2, \\ 0 & \text{otherwise.} \end{cases}$$

Provided that $X_{t+ni} \in \mathcal{R}_2$, where $\mathcal{R}_2 \subset \mathcal{R}_1$, and assuming a small $\epsilon_1$ and a sufficiently large $M$, it is not hard to see that if $\Lambda_{t+ni,n} = 1$, then the following hold as a result of (30) and (31). First, $X_{t+n(i+1)} \in \mathcal{R}_2$. Second, the distance of the vector $X_{t+n(i+1)}$ from the boundary of $\mathcal{R}_2$ becomes the distance of $X_{t+ni}$ plus at least $n\delta_\Delta$. Third,

$$\|X_{t+n(i+1)}\| > \|X_{t+ni}\| + n\delta_\Delta,$$

where $\delta_\Delta$ is an appropriately small positive constant. Fig. 5 shows the region $\mathcal{R}_2$ rotated clockwise, and provides examples for the case where $\Lambda_{t+ni,n} = 1$. Specifically, when $X_{t+ni}$ equals one of the points $X$, $X_1$, and $X_2$, the figure assumes the





drift vector $\Delta_{t+ni,n}$ is within the $\epsilon_2$-neighborhood of $\Delta_{X_{t+ni}}$. For points $X_1$ and $X_2$, the figure also shows the increases in their distance from the boundary of $\mathcal{R}_2$, and denotes them by $d_1$ and $d_2$, respectively. These values, as mentioned above, are lower-bounded by $n\delta_\Delta$ as a result of (30) and (31). To see this, note that, as shown in the figure and suggested by (30), when $X_{t+ni}$ deviates from $X_{\min}$, i.e., when it deviates from the central line in the figure, the vector $\Delta_{X_{t+ni}}$ gets a component towards the central line. This, and the assumption in (31) that the angular deviations in the $\epsilon_2$-neighborhoods are less than half of the one defining region $\mathcal{R}_2$, ensure that after $n$ steps the backlog vector remains in $\mathcal{R}_2$, and that the distance from the boundary of $\mathcal{R}_2$ increases when $\Lambda_{t+ni,n} = 1$. Using a similar argument, it is easy to see that when $\epsilon_1$ is small, an event of the type $\Lambda_{t+ni,n} = 1$ increases the norm of the backlog vector by more than $n\delta_\Delta$. On the other hand, if $\Lambda_{t+ni,n} = 0$, which happens with probability at most $\zeta$, both the distance of $X_{t+n(i+1)}$ from the boundary of $\mathcal{R}_2$ and $\|X_{t+n(i+1)}\|$, compared to the distance of $X_{t+ni}$ and $\|X_{t+ni}\|$, respectively, decrease at most by $n\sqrt{2}(A_{\max} + D_{\max})$. In Fig. 5, the point $X_3$ is an example of this case, where the vector $\Delta_{t+ni,n}$ can be anywhere inside the outer circle, centered at $X_3$, but outside the inner circle defining the $\epsilon_2$-neighborhood of the vector $X_3 + n\Delta_{X_3}$.

In the rest of the proof, as the worst case, we assume that for $X_{t+ni} \in \mathcal{R}_2$, $i \ge 0$, the event $\{\Lambda_{t+ni,n} = 1\}$ occurs with probability $(1 - \zeta)$. Note that $\mathcal{R}_2 \subset \mathcal{R}_1$, and when $X_{t+ni} \in \mathcal{R}_2$, the inequality (29) holds regardless of the past history $\mathcal{H}_{t+ni}$. Let the event that $\Lambda_{t+ni,n} = 1$ be a success. Based on the previous assumption, for $X_{t+ni} \in \mathcal{R}_2$, this success event occurs with probability $(1 - \zeta)$ regardless of the past. Now consider the sequence $\{X_{t+ni}\}$, $0 \le i \le m - 1$, and let $m_{(1-\zeta)}$ be the number of successes of the type $\{\Lambda_{t+ni,n} = 1\}$ out of the $m$ associated trials. The above observations imply that if $X_{t+ni} \in \mathcal{R}_2$ for $0 \le i \le m - 1$, and if



of X t+n(i+1) from TZ 2 and 
distance of X t+ni and ||X t+ni 



TO 



l (i-C) 



then $X_{t+nm} \in \mathcal{R}_2$, and

$$\|X_{t+nm}\| > \|X_t\| + m_{(1-\zeta)}\, n\delta_\Delta - (m - m_{(1-\zeta)})\, n\sqrt{2}(A_{\max} + D_{\max}).$$

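Using the above, we see that a sufficient condition for the sequence $\{X_{t+nm}, m \ge 1\}$ to stay within $\mathcal{R}_2$, and for

$$\|X_{t+nm}\| > \|X_t\| + nm\,\epsilon_3\big(\delta_\Delta + \sqrt{2}(A_{\max} + D_{\max})\big) \tag{33}$$

to hold for some $\epsilon_3$ with

$$\epsilon_3 < \frac{\delta_\Delta}{\sqrt{2}(A_{\max} + D_{\max}) + \delta_\Delta},$$

is that for all $m \ge 1$,

$$r_m = 1 - \frac{m_{(1-\zeta)}}{m} \le \frac{\delta_\Delta}{\sqrt{2}(A_{\max} + D_{\max}) + \delta_\Delta} - \epsilon_3. \tag{34}$$

In what follows, we show that with positive probability the above inequality holds for all $m \ge 1$.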



Starting at $MX_{\min}$, let $\tau_{\mathcal{R}_2}$ be the first time that the ratio $r_m$ does not satisfy (34). Consider the sequence

$$\{\Lambda_{t+nm,n},\ 0 \le m \le \tau_{\mathcal{R}_2} - 2\}. \tag{35}$$

For $0 \le m \le \tau_{\mathcal{R}_2} - 1$, the discussion leading to (33) and (34) implies that $X_{t+nm} \in \mathcal{R}_2$. Furthermore, this discussion shows that the sequence can be considered as a truncated Bernoulli process with success probability $(1 - \zeta)$. An intuitive yet important observation is that for an infinite sequence of Bernoulli trials $\{B_i, i > 0\}$ with success probability $(1 - \zeta)$, for any given $\epsilon_4 > 0$, with positive probability the ratio of failures never reaches $\zeta + \epsilon_4$. This is the key to proving that $\tau_{\mathcal{R}_2} = \infty$, or equivalently, that (34) holds for all $m \ge 1$, with positive probability. Let the notation $r_m$ be re-used as the failure ratio for the infinite Bernoulli process, i.e.,

$$r_m = 1 - \frac{1}{m}\sum_{i=1}^{m} B_i.$$

Using large deviation results [34], we have

$$P(r_m - \zeta > \epsilon_4) \le \rho^m,$$

where

$$\rho = \inf_{s > 0} M_{Z_\zeta}(s) < 1, \tag{36}$$

where $Z_\zeta = 1 - B_1 - \zeta - \epsilon_4$, and $M_{Z_\zeta}(s)$ is the characteristic function of $Z_\zeta$. The above inequality indicates that with probability at least $(1 - \rho^m)$, the ratio of failures after $m$ trials is less than or equal to $\zeta + \epsilon_4$.
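
The exponential bound in (36) can be checked numerically for Bernoulli trials. The sketch below is not taken from the paper: it uses the standard closed form of the optimized Chernoff exponent for Bernoulli variables, $\rho = \exp(-D(\zeta + \epsilon_4 \,\|\, \zeta))$ with $D$ the binary relative entropy, and compares $\rho^m$ against the exact binomial tail. The function names are hypothetical.

```python
import math

def kl_bernoulli(q, p):
    """Relative entropy D(q || p) between Bernoulli(q) and Bernoulli(p),
    for 0 < p < 1 and 0 < q < 1."""
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))

def chernoff_rho(zeta, eps4):
    """Per-trial decay factor rho such that
    P(failure ratio >= zeta + eps4 after m trials) <= rho ** m."""
    return math.exp(-kl_bernoulli(zeta + eps4, zeta))

def exact_tail(m, zeta, eps4):
    """Exact probability that at least m * (zeta + eps4) of m i.i.d.
    trials fail, when each trial fails with probability zeta."""
    k_min = math.ceil(m * (zeta + eps4) - 1e-12)
    return sum(math.comb(m, k) * zeta**k * (1 - zeta)**(m - k)
               for k in range(k_min, m + 1))
```

For example, with `zeta = 0.1` and `eps4 = 0.1`, `chernoff_rho` returns a value strictly between 0 and 1, and `exact_tail(m, 0.1, 0.1)` stays below `chernoff_rho(0.1, 0.1) ** m` for every `m`, which is the property the staged argument that follows relies on.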

To further study $r_m$, we consider the infinite Bernoulli process in a sequence of stages. In the first and second stages, we consider $m$ Bernoulli trials each. However, after the second stage, for the $i$-th stage, we consider the next $2^{i-2}m$ trials. Since trials are independent, with probability $(1 - \zeta)^m$ we can have only successes for the first $m$ trials, and thus the ratio never goes beyond zero, i.e.,

$$r_j = 0, \qquad 1 \le j \le m.$$

For the second stage with the next $m$ trials, using (36), we see that with probability at least $(1 - \zeta)^m (1 - \rho^m)$

$$\max_{0 < j \le 2m} r_j \le \frac{0 + m(\zeta + \epsilon_4)}{m + m(\zeta + \epsilon_4)} < 2(\zeta + \epsilon_4),$$

where the first inequality refers to the worst case where, in the second stage of $m$ trials, the failures happen in the beginning of the stage, i.e., when the $(m+1)$-th, $(m+2)$-th, ..., and $(m + m(\zeta + \epsilon_4))$-th trials are all failures. Inductively, considering the $(l+2)$-th stage, we see that with probability at least (1

O m K=o 0--P 



2 p rn\ 



l<j<2< ! + 1 )' 



J ~ 2 i TO + 2 i TO(C + £ 4 ) [ 



£4 



(37) 



where the numerator is the total number of failures up to the end of the $(l+2)$-th stage, and the denominator corresponds to the worst case where the failures in the $(l+2)$-th stage all occur in the beginning of the stage. Therefore, with probability at least

$$p_c = (1 - \zeta)^m \prod_{l=0}^{\infty} \big(1 - \rho^{2^l m}\big),$$

the ratio $r_m$, $m \ge 1$, always stays below $2(\zeta + \epsilon_4)$. But

$$p_c \ge (1 - \zeta)^m \prod_{l=1}^{\infty} \big(1 - (\rho^m)^l\big).$$

This and Lemma indicate that 

PC > 0. 
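
The claim that the failure ratio stays below a fixed threshold forever with positive probability can be illustrated by simulation. The sketch below is a finite-horizon proxy only (an infinite sequence cannot be simulated), and the parameter values and function names are hypothetical.

```python
import random

def max_running_failure_ratio(successes):
    """Largest value of r_m = 1 - (1/m) * sum_{i<=m} B_i over all
    prefixes of a 0/1 sequence, where 1 denotes a success."""
    failures, worst = 0, 0.0
    for m, b in enumerate(successes, start=1):
        failures += 1 - b
        worst = max(worst, failures / m)
    return worst

def prob_ratio_stays_below(zeta, threshold, horizon, runs, seed=0):
    """Fraction of simulated paths whose running failure ratio never
    exceeds `threshold` within the first `horizon` trials; each trial
    fails independently with probability zeta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        path = [1 if rng.random() >= zeta else 0 for _ in range(horizon)]
        if max_running_failure_ratio(path) <= threshold:
            hits += 1
    return hits / runs
```

With a small failure probability such as `zeta = 0.05` and a threshold of `0.2`, most paths never cross the threshold: the ratio can only be pushed above it by a failure among the first few trials, after which the law of large numbers keeps it near `zeta`. This mirrors the role of the first stage in the argument above.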

The above discussion implies that with a positive probability, not less than $p_c$, the ratio $r_m$ associated with the sequence in (35) stays below $2(\zeta + \epsilon_4)$. Hence, if $\zeta$ and $\epsilon_4$ are chosen such that

$$\zeta + \epsilon_4 < \frac{1}{2}\Big(\frac{\delta_\Delta}{\sqrt{2}(A_{\max} + D_{\max}) + \delta_\Delta} - \epsilon_3\Big), \tag{38}$$

then starting at $X_t = MX_{\min}$, the inequality in (34) holds for all $m \ge 1$ with positive probability. Since this latter statement can be generalized to the case where $X_t \in \mathcal{R}_2$, we have that

$$P\big(\forall m > 0,\ X_{t+nm} \in \mathcal{R}_2 \text{ and (33) holds} \,\big|\, X_t \in \mathcal{R}_2\big) > 0 \tag{39}$$

if (38) holds. But (38) can be satisfied since the choice for a positive $\epsilon_4$ is arbitrary and, as mentioned in the discussion leading to (29), $\zeta$ can be chosen arbitrarily small. Hence, for an appropriate choice of parameters, (39) holds, which suggests that with positive probability $X_{t+nm}$ stays in $\mathcal{R}_2$, and its norm increases (at least) linearly with $m$. Since by C4 with positive probability $X_t \in \mathcal{R}_2$ for some $t$, and arrivals and departures are bounded, implying for $0 \le j < n$

$$\|X_{t+mn-j}\| \ge \|X_{t+nm}\| - C, \quad \text{for some } C > 0,$$

we see that (39) indicates that with positive probability

$$\lim_{i \to \infty} \|X_{t+i}\| = \infty.$$

This shows that when the input rate is given by (28), and thus when it is outside the region $\theta_\infty\Gamma$, for any bounded region $C$, with positive probability the process $X_{t+i}$ never returns to $C$, and hence $E[\tau_C] = \infty$, implying network instability. This completes the proof of part (c) of the theorem.



Appendix 
Lemmas 

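Lemma 1: For any $\epsilon > 0$, regardless of the past history $\mathcal{H}_t$, there exists a sufficiently large $K_\epsilon$ such that for all $s \in \mathcal{S}$ and $K > K_\epsilon$

$$\Big| E\Big[(\tau_K + 1)\pi(s) - \sum_{k=0}^{\tau_K} \mathbf{1}_{s(t+k) = s} \,\Big|\, \mathcal{H}_t\Big] \Big| \le \epsilon\, E\big[(\tau_K + 1) \,\big|\, \mathcal{H}_t\big].$$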

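Proof: Since $\tau_{i+1} - \tau_i \ge 2N_c$, it is easy to verify that $\tau_K \to \infty$, a.s., as $K \to \infty$. This almost sure convergence and the ergodicity of the channel process, as stated in Section III-B, imply that as $K \to \infty$

$$\frac{1}{\tau_K + 1}\sum_{k=0}^{\tau_K} \mathbf{1}_{s(t+k) = s} \to \pi(s), \quad \text{a.s.} \tag{40}$$

Moreover, since the channel convergence in Section III-B is uniform in the past history and $t$, and since the number of channel states is finite, we see that the above convergence is uniform in $t$, $\mathcal{H}_t$, and $s$. Thus, for any $\epsilon' > 0$ and $\zeta > 0$, we can find a sufficiently large $K_{\epsilon',\zeta}$, independent of the past history $\mathcal{H}_t$ and $s$, such that [29]

$$P\Big(\sup_{K \ge K_{\epsilon',\zeta}} \Big|\pi(s) - \frac{1}{\tau_K + 1}\sum_{k=0}^{\tau_K} \mathbf{1}_{s(t+k) = s}\Big| > \epsilon' \,\Big|\, \mathcal{H}_t\Big) < \zeta. \tag{41}$$

Given $K_{\epsilon',\zeta}$, let $\mathcal{A}_{K_{\epsilon',\zeta},\epsilon'}$ denote the set of all $\omega \in \Omega$ with the property that

$$\sup_{K \ge K_{\epsilon',\zeta}} \Big|\pi(s) - \frac{1}{\tau_K + 1}\sum_{k=0}^{\tau_K} \mathbf{1}_{s(t+k) = s}\Big| > \epsilon'. \tag{42}$$

By (41), we have that

$$P(\mathcal{A}_{K_{\epsilon',\zeta},\epsilon'} \mid \mathcal{H}_t) < \zeta.$$

Suppose $K > K_{\epsilon',\zeta}$ and let

$$A = E\Big[(\tau_K + 1)\pi(s) - \sum_{k=0}^{\tau_K} \mathbf{1}_{s(t+k) = s} \,\Big|\, \mathcal{H}_t\Big].$$

Using conditional expectations and the definition of $\mathcal{A}_{K_{\epsilon',\zeta},\epsilon'}$, and considering the fact that $0 \le \pi(s) \le 1$ and $\tau_K \ge 0$, we can show that

$$A \le P(\omega \notin \mathcal{A}_{K_{\epsilon',\zeta},\epsilon'})\, E[\epsilon'(\tau_K + 1) \mid \omega \notin \mathcal{A}_{K_{\epsilon',\zeta},\epsilon'}, \mathcal{H}_t] + P(\omega \in \mathcal{A}_{K_{\epsilon',\zeta},\epsilon'})\, E[(\tau_K + 1) \mid \omega \in \mathcal{A}_{K_{\epsilon',\zeta},\epsilon'}, \mathcal{H}_t]. \tag{43}$$

Similarly, we obtain

$$E[(\tau_K + 1) \mid \mathcal{H}_t] = P(\omega \notin \mathcal{A}_{K_{\epsilon',\zeta},\epsilon'})\, E[(\tau_K + 1) \mid \omega \notin \mathcal{A}_{K_{\epsilon',\zeta},\epsilon'}, \mathcal{H}_t] + P(\omega \in \mathcal{A}_{K_{\epsilon',\zeta},\epsilon'})\, E[(\tau_K + 1) \mid \omega \in \mathcal{A}_{K_{\epsilon',\zeta},\epsilon'}, \mathcal{H}_t].$$

Since $\tau_K \ge 0$, the above implies that

$$P(\omega \notin \mathcal{A}_{K_{\epsilon',\zeta},\epsilon'})\, E[(\tau_K + 1) \mid \omega \notin \mathcal{A}_{K_{\epsilon',\zeta},\epsilon'}, \mathcal{H}_t] \le E[(\tau_K + 1) \mid \mathcal{H}_t]. \tag{44}$$

In addition, w.p.1, $\tau_K + 1 \le (K + 1)(1 + L_1)N_c$. It thus follows from (42), (43), and (44) that

$$A \le \epsilon' E[(\tau_K + 1) \mid \mathcal{H}_t] + \zeta (K + 1)(1 + L_1)N_c.$$

Noting the fact that $\tau_K \ge 2KN_c$, we obtain

$$A \le E[(\tau_K + 1) \mid \mathcal{H}_t]\Big(\epsilon' + \zeta\, \frac{(K + 1)(1 + L_1)N_c}{E[(\tau_K + 1) \mid \mathcal{H}_t]}\Big) \le E[(\tau_K + 1) \mid \mathcal{H}_t]\Big(\epsilon' + \zeta\, \frac{(K + 1)(1 + L_1)N_c}{2KN_c + 1}\Big) = \epsilon\, E[(\tau_K + 1) \mid \mathcal{H}_t],$$

where

$$\epsilon = \epsilon' + \zeta\, \frac{(K + 1)(1 + L_1)N_c}{2KN_c + 1}$$

can be made arbitrarily small by choosing sufficiently small values for $\epsilon'$ and $\zeta$. A similar discussion holds for $-A$ with the same $\epsilon$, completing the proof. ■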
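
Lemma 1 rests on the ergodic convergence in (40): the fraction of slots spent in a channel state approaches its stationary probability $\pi(s)$. A minimal sketch of that convergence follows, assuming a two-state Markov channel with made-up transition probabilities (the paper's channel model is more general); the function names are hypothetical.

```python
import random

def stationary_two_state(p01, p10):
    """Stationary distribution (pi0, pi1) of a two-state Markov chain
    with transition probabilities p01 (0 -> 1) and p10 (1 -> 0)."""
    total = p01 + p10
    return p10 / total, p01 / total

def empirical_frequency(p01, p10, steps, state=0, seed=0):
    """Fraction of the first `steps` slots that the chain spends in
    state 1, starting from `state`."""
    rng = random.Random(seed)
    visits = 0
    for _ in range(steps):
        visits += state
        u = rng.random()
        if state == 0:
            state = 1 if u < p01 else 0
        else:
            state = 0 if u < p10 else 1
    return visits / steps
```

For `p01 = 0.1` and `p10 = 0.2`, the stationary probability of state 1 is 1/3, and `empirical_frequency(0.1, 0.2, 100000)` should land close to that value; the gap plays the role of $\epsilon'$ in (41).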
Lemma 2: For any given $\epsilon > 0$, there exists a sufficiently large constant $K_\epsilon > 0$ such that for all $K > K_\epsilon$, we can find a proper $M_{\epsilon,K}$ such that if $\|X_t\| > M_{\epsilon,K}$, the following holds:

$$E\Big[\sum_{k=0}^{\tau_K} \big(X_t a - X_{t+k} D^*_{t+k}\big) \,\Big|\, \mathcal{H}_t\Big] \le E\Big[(\tau_K + 1)\|X_t\| \Big(\epsilon - \big(1 - \max_s \sum_{I \in \mathcal{I}} \beta_{s,I}\big)\chi(X_t)\Big) \,\Big|\, \mathcal{H}_t\Big].$$

Proof: To prove the lemma, we first note that by the definition of $D^*(X_t, s)$, and the assumption that departures are bounded by $D_{\max}$, we have

$$X_{t+k} D^*_{t+k} = \max_{I \in \mathcal{I}} X_{t+k} D(s_{t+k}, I) \ge \max_{I \in \mathcal{I}} X_t D(s_{t+k}, I) - \max_{I \in \mathcal{I}} \sum_{i=0}^{k-1} D_{t+i} D(s_{t+k}, I) \ge \max_{I \in \mathcal{I}} X_t D(s_{t+k}, I) - kND_{\max}^2. \tag{45}$$

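Using (11), we also observe that

$$X_t a - E_s[X_t D^*(X_t, s)] = X_t \sum_{s \in \mathcal{S}} \pi(s) \sum_{I \in \mathcal{I}} \beta_{s,I} D(s, I) - \sum_{s \in \mathcal{S}} \pi(s) X_t D^*(X_t, s)$$

$$= \sum_{s \in \mathcal{S}} \pi(s) \Big(\sum_{I \in \mathcal{I}} \beta_{s,I}\big(X_t D(s, I) - X_t D^*(X_t, s)\big) - \big(1 - \sum_{I \in \mathcal{I}} \beta_{s,I}\big) X_t D^*(X_t, s)\Big). \tag{46}$$

Since by definition for all $I \in \mathcal{I}$

$$X_t D^*(X_t, s) \ge X_t D(s, I),$$

we have

$$X_t a - E_s[X_t D^*(X_t, s)] \le -\sum_{s \in \mathcal{S}} \pi(s)\big(1 - \sum_{I \in \mathcal{I}} \beta_{s,I}\big) X_t D^*(X_t, s) \le -\big(1 - \max_s \sum_{I \in \mathcal{I}} \beta_{s,I}\big) \sum_{s \in \mathcal{S}} \pi(s) X_t D^*(X_t, s) = -\|X_t\|\big(1 - \max_s \sum_{I \in \mathcal{I}} \beta_{s,I}\big)\chi(X_t), \tag{47}$$

where the last equality follows from the definition of $\chi(X_t)$.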
Back to the inequality in the lemma, using d431 l, we have 

E[^X t a- X t+k T>* t+k \H t ] < E[(r K + if N D 2 max \H t ] 



k=0 



E 



[ J (X t a - X t Y ls(t+fc)=sD*(Xt, s)) \H t 



fc=0 



se5 



= E[(T K + l) 2 ND 2 max \H t 



E 



^X f a-X 4 £D*(X t ,s)£l 

se5 fc=o 



Hi 



fc=0 



Using Lemma [T] for t\ > and sufficiently large K\, we have 
that for K > K x 



E[£x t a-Xt +fc D* +fc |Wt] 

fc=0 

< E[( TK + l) 2 ND 2 nax \U t ] + e[(tk + l)X t a 
-(r^ + l)Xt^D*(Xt,s)( 7 r(s)-e 1 ) \U t 



ses 



= E[(T K + l) 2 ND 2 max \H t ] 

+ e 1 E[(r K + l)||X t |||5|v / ]V J D Ht 



E 



[t k + 1) (x t a - E s [X t D*(X t , s)] ) \U t 



(48) 



Combining (47) and (48), we obtain the inequality in the lemma with

$$\epsilon = \frac{(\tau_K + 1)ND_{\max}^2}{\|X_t\|} + \epsilon_1 |\mathcal{S}| \sqrt{N} D_{\max}.$$

The choice for a positive $\epsilon$ is arbitrary since one can first select $K_\epsilon > K_1$ such that for all $K > K_\epsilon$, $\epsilon_1$ is sufficiently small. After selecting $K$, because w.p.1 $\tau_K + 1 \le (K + 1)(1 + L_1)N_c$, one can choose $M_{\epsilon,K}$ such that for $\|X_t\| > M_{\epsilon,K}$ the first term in $\epsilon$ is also sufficiently small, completing the proof.

■ 

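Lemma 3: For any given $\epsilon > 0$, there exists a sufficiently large constant $K_\epsilon > 0$ such that for all $K > K_\epsilon$, we can find a proper $M_{\epsilon,K}$ such that if $\|X_t\| > M_{\epsilon,K}$, the following holds:

$$E\Big[\sum_{i=0}^{\tau_K} X_{t+i} D^*_{t+i} \,\Big|\, \mathcal{H}_t\Big] \le E\big[(\tau_K + 1)\|X_t\|\big(\chi(X_t) + \epsilon\big) \,\big|\, \mathcal{H}_t\big].$$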



Proof: Using the definition of D*(X, s), for the LHS of 
the inequality in the lemma we can show that 

LHS = E [ Y (( X * + E( A '+^ - D '+^ + U '+^)) 



i=0 



3=0 



Since arrivals and departures are bounded by A 
respectively, we have that 



D(s t+i) I)J|W t 
and D„, 



LHS<E[£x t D*(X t ,s t+l ) \H t ] 

Y iND 



i=0 

TK 



7L max J - y max 



2 

max 



(49) 



i=0 



Let £ be the first term of the RHS of the above inequality. 
We have 



TK 

S = E iim XtD *( Xt ' S ) 1 ^) = S |^ 

i=o ses 

TK 

= X i ^D , (X,,8)E[^ !.(*+*)=. \Ht 



ses 



i=0 
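Using Lemma 1, for any positive $\epsilon_1$, we can find a sufficiently large $K_1$ such that for $K > K_1$

$$\mathcal{E} \le X_t \sum_{s \in \mathcal{S}} D^*(X_t, s)\, E\big[(\tau_K + 1)(\pi(s) + \epsilon_1) \,\big|\, \mathcal{H}_t\big] \le E\Big[(\tau_K + 1)\sum_{s \in \mathcal{S}} \pi(s) X_t D^*(X_t, s) + \epsilon_1 |\mathcal{S}| \sqrt{N} \|X_t\| D_{\max} (\tau_K + 1) \,\Big|\, \mathcal{H}_t\Big] = E\big[(\tau_K + 1)\|X_t\|\big(\chi(X_t) + \epsilon_1 \sqrt{N} |\mathcal{S}| D_{\max}\big) \,\big|\, \mathcal{H}_t\big], \tag{50}$$

where the last equality follows from the definition of $\chi(X_t)$. Considering inequalities (49) and (50), we obtain

$$\text{LHS} \le E\big[(\tau_K + 1)\|X_t\|\big(\chi(X_t) + \epsilon_2\big) \,\big|\, \mathcal{H}_t\big], \tag{51}$$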



where 

e 2 = e 1 |S|v / iV£> r , 



+ (r K + 1)7V 



+ D 2 



IX, 



To complete the proof, it remains to show that $\epsilon_2$ can be made arbitrarily small. Consider any positive $\epsilon$. We first choose $K_\epsilon$ such that for $K > K_\epsilon$ the value of $\epsilon_1$ is sufficiently small to make the first term in $\epsilon_2$ less than $\epsilon/2$. Since $\tau_K + 1 \le (K + 1)(1 + L_1)N_c$, we see that for a given $K$ with $K > K_\epsilon$, if $\|X_t\| > M_{\epsilon,K}$ for a sufficiently large $M_{\epsilon,K}$, then the second term in $\epsilon_2$ can also be made less than $\epsilon/2$. Therefore, for any positive $\epsilon$, if $K > K_\epsilon$ and $\|X_t\| > M_{\epsilon,K}$, for appropriate values of $K_\epsilon$ and $M_{\epsilon,K}$, then the inequality (51) holds with $\epsilon_2 < \epsilon$. But this means the inequality also holds for $\epsilon$, as required. ■
Lemma 4: Suppose $6\theta_\varphi < \alpha$, and let $\epsilon$ be a positive real number. For any given $\epsilon$, there exists a constant $K_\epsilon$ such that if $K > K_\epsilon$, then for $\|X_t\| > M_{\epsilon,K}$ the following holds:



E 



£ Xt +i r> t+ i\H t ] 

> e[(tjc + l)||X t || (JZooC^*) - a - 3^) - e) |W t 

where M e ,K is a sufficiently large constant depending on e 
and X, and i?^ is defined in 

Proof: The essence of the proof of this lemma is finding a lower bound for the percentage of time that near-optimal values for $N_1$ are used by DCP. We prove that this percentage is close to $R_\infty$. First, we place a requirement on $\|X_t\|$ for a given $K$. Later in the proof, we find an appropriate lower bound $K_\epsilon$ for $K$ according to the value of $\epsilon$. Note that w.p.1, for any given $K$, $\tau_K < (K + 1)(1 + L_1)N_c$. Therefore, since departures and arrivals are bounded by $D_{\max}$ and $A_{\max}$, respectively, we can easily see that for $0 \le i \le \tau_K$, $\|X_{t+i} - X_t\| < C_K$, where $C_K$ is an appropriate constant depending on $K$. Having this inequality, we can find an appropriate constant $M_K$, depending on $K$, such that if

$$\|X_t\| > M_K, \tag{52}$$



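then the following statements hold according to Property 1 and Property 2, respectively, with $\epsilon_1 < \frac{1}{2}\big(\frac{\alpha}{6} - \theta_\varphi\big)$:

Statement 1: For $t \le t_1 \le t + \tau_K$, $t \le t_2 \le t + \tau_K$, and any $N_1 \in \mathcal{N}_1$,

$$|\phi(X_{t_1}, N_1) - \phi(X_{t_2}, N_1)| < \epsilon_1. \tag{53}$$

Statement 2: For any $\tau_i$, with $0 \le i \le K$, and any $N_1 \in \mathcal{N}_1$, with probability $(1 - \rho_\varphi)$, and regardless of $i$ and the past history at time $t + \tau_i + 1$, $\mathcal{H}_{t+\tau_i+1}$, we have

$$|\varphi^r(t + \tau_i + 1) - \phi(t + \tau_i + 1, N_1^r(t + \tau_i + 1))| < \theta_\varphi + \epsilon_1. \tag{54}$$

Similarly, with probability $(1 - \rho_\varphi)$, and regardless of $i$ and the past history at time $t + \tau_i + 1 + N_c$, $\mathcal{H}_{t+\tau_i+1+N_c}$, we have

$$|\varphi(t + \tau_i + 1) - \phi(t + \tau_i + 1 + N_c, N_1(t + \tau_i + 1))| < \theta_\varphi + \epsilon_1. \tag{55}$$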

Remark 1: Property 2 states that the inequalities in Statement 2 may hold in general with different probabilities, all not less than $(1 - \rho_\varphi)$. However, to consider the worst-case analysis, in Statement 2 we have assumed that these inequalities, with the given conditions, hold with the same probability $(1 - \rho_\varphi)$ for all $i$, where $0 \le i \le K$.

Remark 2: Consider the $(i+1)$-th and the $(j+1)$-th rounds, where $0 \le i, j \le K$ and $i \neq j$. Since inequalities (54) and (55) in Statement 2 may hold in the $(i+1)$-th round with probability $1 - \rho_\varphi$ regardless of $\mathcal{H}_{t+\tau_i+1}$ and $\mathcal{H}_{t+\tau_i+1+N_c}$, respectively, Statement 2 implies that the event that (54) or the one that (55) holds in the $(i+1)$-th round is independent of the inequality (54) or (55) holding in the $(j+1)$-th round. In addition, the event that (54) holds in the $(i+1)$-th round is independent of (55) holding in the same round.

Before going to the main part of the proof, we first derive two key inequalities. To obtain the first one, note that for any two time instants $t_1$ and $t_2$, with $t \le t_1 \le t + \tau_K$ and $t \le t_2 \le t + \tau_K$, using (53), we have that

$$|\phi(t_1, \hat{N}_1(X_{t_1})) - \phi(t_2, \hat{N}_1(X_{t_1}))| < \epsilon_1, \tag{56}$$

and

$$|\phi(t_1, \hat{N}_1(X_{t_2})) - \phi(t_2, \hat{N}_1(X_{t_2}))| < \epsilon_1.$$

By the definition of $\hat{N}_1(X)$ and the inequality in (56), we have

$$\phi(t_1, \hat{N}_1(X_{t_1})) - \phi(t_2, \hat{N}_1(X_{t_2})) \le \phi(t_1, \hat{N}_1(X_{t_1})) - \phi(t_2, \hat{N}_1(X_{t_1})) < \epsilon_1.$$

We can obtain the other direction of the inequality similarly. Thus,

$$|\phi(t_1, \hat{N}_1(X_{t_1})) - \phi(t_2, \hat{N}_1(X_{t_2}))| < \epsilon_1. \tag{57}$$

This inequality shows that when the backlog vector has a large norm, the optimal $\phi$ does not vary significantly over a limited time horizon. In particular, the variation approaches zero when $\|X_t\|$ approaches $\infty$.

To derive the second key inequality, first note that based on the definition of $\tau_i$ given in the proof of part (a) of the theorem, the $(i+1)$-th round after time $t$ begins at $t + \tau_i + 1$, and the time interval between $t + \tau_0 + 1$ and $t + \tau_K + 1$ consists of $K$ scheduling rounds. To simplify the notation, let $\hat{N}_1$ be the optimal value of $N_1$ for the first round after time $t$, i.e.,
Ni = iVi(X t+ro+ i). In addition, let N[(j) be the candidate 
value for N\ in the j + l t h round, and let Nx(J) be the value 
of Ni used in the update interval of the j + l t h round, i.e., 
= N[(t + Tj + 1), and N x (j) = N^t + Tj + 1). 
Now, consider the i + l t h round, i > 0, and suppose the op- 
timal iVi is selected at this round, i.e., N^(i) = Nx(X-t+n+x)- 
Let N% — iVi(X t+Ti+ i). Then the inequality in (|54l and the 
preceding inequality imply that with probability (1 — g v ) 

\<p r (t + n + 1) - 4>{t + r + 1, N x )\ <2ex + 6 v . (58) 

Let 

e' = 2e 1 +6 (p . (59) 

Based on the assumption Q6 V < a imposed by the Lemma 
and that e% < ^(f — 6 V ), we have 

0<6e'<6((|-^)+^) = a. (60) 

The inequality (l58l is the second key inequality required for 
the rest of the proof. 

We are now in a position to explain the essence of the proof, where we find a lower bound for the fraction of time in the horizon of $K$ rounds in which near-optimal values for $N_1$ are used. Towards this end, we first assume that the inequalities in (54) and (55) hold with probability one for all $K$ scheduling rounds, thus assuming $\rho_\varphi = 0$ in Statement 2. We then extend our discussion to realistic cases where $\rho_\varphi > 0$.

Discussion assuming g v = : Suppose at the i + l t h 
round, i > 1, the optimal N\ corresponding to X t+Ti+ i is 
selected, i.e., N[(i) = Nx = Nx(X- t+n+ x), Considering the 
scheduling policy, with respect to the update of Nx in i + l th 
scheduling round, there are two possible cases: 

Case 1: In this case, we assume ip r (t+Ti + l) > (p(t+Ti-i + 
l) + a. Thus, according to the update rule, Nx gets updated at 
the i + l t h round, and takes the value Nx(i) — N[(i) = Nx- 
However, it remains unchanged until the the K + l t h round. 
We can prove this statement by induction. To see this, assume 
that Ni remains fixed after the i + l t h but changes for the 
first time in the + 1 round, where j > i. Therefore, by the 
update rule, we must have 

(p r (t + tj + 1) > <p(t +Tj-x + 1) + a. (61) 

Since 

W + r 3 + 1, Nl(j)) - 0(i + to + 1, iV[(i))| < ex, 

and 

\<p r (t + r 3 + 1) - <j>{t + t 3 + l,N{{j))\ <9 V + ex, 

which follow from d53l l and (l54l >. respectively, and the assump- 
tion that g v = 0, we have 

^ r {t + Tj + 1) < cj>(t + r + 1, iq(j)) + 2ex + e 9 

<cf>(t + T + l,Nx) + e', (62) 
where the last inequality follows from the definition of Nx. 



Similarly, since by assumption N(j — 1) = Nx, we can use 
(|53l > and ((55]l to show that 

\<p(t + Tj—x + 1) - <f>(t +n + l, Nx)\ < e'. 

Considering this inequality and ( |57] >, we obtain 

tp(t + Tj-x + 1) > <f>(t + r + l,Nx) - e' -ex. (63) 

Finally, considering (loTT l, (|62l , and j63l , we obtain 

4>(t + T Q + l,Nx)+e > 

<f>(t + r + 1, iVi) - e' - ex + a, 

which implies that 2e' + ex > a. This is in contradiction with 
(l6Cfl l stating that 6e' < a. Therefore, Nx(j) = Nx for i < j < 
K — 1, proving the claim. 

A byproduct of the above discussion is that after the i t h 
round, ip(t + Tj + 1) stays close to <j>(t + tq + l,Nx). More 
precisely, since Nx(j) = Nx for i < j < K — 1, we have 

\<p(t + Tj + 1) - 4>{t + Tj + 1, Nx))\ < 

Moreover, from (|53l we have 

\4>(t + Tj + 1, Nx) - ct>(t + n + 1, Nx)\ < ex. 

Using the last two inequalities and d57| > , for i < j < K — 1, 
we obtain 

\ip(t + Tj + l)-<f>(t + T + l,Nx)\ < e' + ex, (64) 

which shows how close is <p(t + Tj + 1) to <j>(t + tq + 1, Nx). 

Case 2: In this case, we assume (p r (t + Ti + 1) < (pit + 
Ti-x + 1) + a. Taking similar steps as in Case 1, we can show 
that 

4>(t + t q + 1, Nx) - e' < f r {t +n + l) 

and 

<p{t + n-x + 1) < 4>(t + r + 1, iVi(i - 1)) + e', 
Hence, using the assumption, we obtain 

4>{t + t + 1, Nx) -2e'-a< 4>{t + r + 1, N x (i - 1)). 

(65) 

We next show that Nx gets updated at most once in the rest of 
K — (i + 1) rounds. Let the j\ + l t h round, for i < jx < K — 1, 
be the first round after the i + 1± round that Nx gets updated. 
Using similar arguments as the ones in Case 1, we have 

<p r {t + Til + 1) < cj>{t + T + 1, Nl(Jx)) + e. 

and 

tp(t + Tj ± —x + 1) > 4>{t + to + 1, Nx(i - 1)) - e', 

where in the above we have used the assumption that Nx does 
not change before the jx + 1 round, and thus have set Nx (jx ~ 
1) = Nx(i — 1). Since Nx gets updated at the jx + lth, we 
have Nx(j) = Nl(j). Using this, the update rule, and the last 
two inequalities, we have 

c/>(t + T + l,Nx(i -l))-e +a 

<cf>(t + T + l,Nx{jx))+e' 
This inequality and (65) yield

<t>{t + t + 1, N x ) - 4e' < cf>(t + r + 1, Nt(ji)). (66) 

Similarly, if there exists $j_2$, $j_1 < j_2 < K-1$, such that at
$t + \tau_{j_2} + 1$, $N_1$ becomes updated for the second time, we can show that

$\phi(t + \tau_0 + 1, N_1(j_1)) - \epsilon' + \alpha < \phi(t + \tau_0 + 1, N_1(j_2)) + \epsilon'$.

In other words,

$\phi(t + \tau_0 + 1, N_1(j_1)) + \alpha - 2\epsilon' < \phi(t + \tau_0 + 1, N_1(j_2))$.

Therefore, every time that $N_1$ becomes updated, the algorithm
finds a better estimate for $\phi(t + \tau_0 + 1, N_1)$. More specifically,
after each update, the gap between $\phi(t + \tau_0 + 1, N_1(j_k))$ and
$\phi(t + \tau_0 + 1, N_1)$ is decreased by $(\alpha - 2\epsilon') > \frac{2}{3}\alpha$. However,
(65) shows that the initial gap is $\alpha + 2\epsilon'$, which is less than
or equal to $\frac{4}{3}\alpha$. Therefore, $N_1$ can be updated at most once
in the remaining $K - i - 1$ scheduling rounds.

In this case, similar to what we observed in Case 1,
$\varphi(t + \tau_j + 1)$ stays close to $\phi(t + \tau_0 + 1, N_1)$. To see this, consider
a scheduling round, e.g., the $(j+1)$th round for $i < j < K-1$,
where $N_1(i-1)$ is used. By (53) and (55), we have

$|\varphi(t + \tau_j + 1) - \phi(t + \tau_0 + 1, N_1(i-1))| < \epsilon'$.

Considering the above inequality and (65), we obtain

$|\varphi(t + \tau_j + 1) - \phi(t + \tau_0 + 1, N_1)| < \alpha + 3\epsilon'$. (67)

In the same manner, if instead of $N_1(i-1)$ an updated version
of $N_1$ is used in a scheduling round, we can use the inequality
in (66) to show that the above inequality still holds. Hence,
the inequality in (67) holds for all $j$ with $i < j < K-1$ since,
as proved earlier, $N_1$ becomes updated at most once.

Combining the inequality (64) associated with Case 1
and the inequality (67) associated with Case 2, we see that
regardless of which case happens, the following holds for
$i < j < K-1$:

$|\varphi(t + \tau_j + 1) - \phi(t + \tau_0 + 1, N_1)| < \gamma$, (68)

where

$\gamma = \alpha + 3\epsilon'$. (69)

Inspired by the above inequality, we now define a new
random variable $R_K$ as the fraction of time that a "near
optimal" solution is used in the time horizon consisting of $K$
rounds. By near optimal in a scheduling round, e.g., the $(j+1)$th
round, we mean a choice of $N_1$ that ensures $\varphi(t + \tau_j + 1)$ is
close to $\phi(t + \tau_0 + 1, N_1)$ in the sense of (68). Intuitively, a
larger $R_K$ results in a larger scaling factor, and thus, a better
throughput performance. In the following, using the preceding
discussions provided in Case 1 and Case 2, we find a lower
bound for $R_K$.

As explained in Section IV-B, in the beginning of each
round, e.g., the $(j+1)$th round, the optimal $N_1$, corresponding
to $X_{t+\tau_0+1}$, is chosen independently with probability $\delta$.
Therefore, we see that with probability $(1-\delta)^{i-1}\delta$, after the
first round, the optimal solution is selected for the first time in
the $(i+1)$th round, $i \ge 1$. Suppose this event happens at the $(i+1)$th
round, $i \ge 1$. If Case 1 happens, we can partition the time
interval between $t + \tau_0 + 1$ and $t + \tau_K + 1$ into three sets. The
first set consists of all test intervals. The second set consists
of the update intervals before the $(i+1)$th round. Finally, the
third set consists of the update intervals after the $i$th round.
Considering these sets in sequence, we can express the total
number of timeslots between $t + \tau_0 + 1$ and $t + \tau_K + 1$ by

$K N_c + \sum_{j=0}^{i-1} N_c N_3(j) + \sum_{j=i}^{K-1} N_c \min\left(\max\left(1, \frac{N_3(i-1)}{2}\right) 2^{j-i}, L_1\right)$, (70)

where $N_3(j) = N_3(t + \tau_j + 1)$. To obtain the above expression,
we have used the fact that when Case 1 happens, according
to the update rule, at the $(i+1)$th round $N_3(i)$ becomes half
of the previous value for $N_3$, but keeps doubling for each
following round. Recalling that (68) holds after the $i$th round,
and $N_3(j) \le L_1$, we can use (70) to show that for $K > i + 1$,
w.p.1,

R K > 3 g ■ \ .. (71) 

" K + iLi + 2 + EjLl min(2J- 1 , L x ) 

For a given fixed $i$, the above fraction approaches $\frac{L_1}{1+L_1}$
as $K$ approaches $\infty$. Therefore, for any given positive $\epsilon_2$,
we can choose $K$ sufficiently large such that for all $i$ with
$1 \le i \le i_{\max}$, the above fraction is larger than $\frac{L_1}{1+L_1} - \epsilon_2$.
Applying a similar argument to the second case, we can find a
sufficiently large $K$ such that the fraction of time over which
the near optimal solution is used is larger than $\frac{L_1}{1+L_1} - \epsilon_2$. In
addition, we can select $i_{\max}$ such that for a given positive $\zeta_1$

$\sum_{i=1}^{i_{\max}} (1-\delta)^{i-1}\delta > 1 - \zeta_1$.

Hence, if $K$ is sufficiently large, with probability larger than
$1 - \zeta_1$, we have

$R_K > \frac{L_1}{1+L_1} - \epsilon_2$. (72)
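The choice of $i_{\max}$ above amounts to bounding a geometric tail, since the partial sum equals $1 - (1-\delta)^{i_{\max}}$. The following is a minimal Python sketch of that computation; the function names and the numerical values of `delta` and `zeta1` are illustrative assumptions and do not come from the paper.

```python
def first_success_mass(delta: float, i_max: int) -> float:
    """Return sum_{i=1}^{i_max} (1 - delta)^(i-1) * delta, term by term."""
    return sum((1.0 - delta) ** (i - 1) * delta for i in range(1, i_max + 1))


def smallest_i_max(delta: float, zeta1: float) -> int:
    """Smallest i_max whose partial sum exceeds 1 - zeta1.

    Assumes 0 < delta < 1 and 0 < zeta1 < 1 (hypothetical inputs).
    """
    i_max = 1
    while first_success_mass(delta, i_max) <= 1.0 - zeta1:
        i_max += 1
    return i_max


if __name__ == "__main__":
    # Illustrative values only: delta = 0.2, zeta1 = 0.05.
    print(smallest_i_max(0.2, 0.05))
```

Because the tail decays geometrically, the required $i_{\max}$ grows only logarithmically in $1/\zeta_1$ for a fixed $\delta$.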

This is an interesting observation. Since the choices for $\epsilon_2$
and $\zeta_1$ are arbitrary, this observation implies that in the limit of
large backlog vectors, the policy keeps the network operating
at near optimal points for at least a $\frac{L_1}{1+L_1}$ fraction of time. Hence,
in the limit, at most only the time for selecting new values for
$N_1$ and observing their performance is wasted, which constitutes
a $\frac{1}{1+L_1}$ fraction of total time. Recall that near optimality
is defined in (68), and $\phi(t + \tau_0 + 1, N_1) = \phi(X_{t+\tau_0+1})$. We
therefore, as a result of the preceding inequality, expect the
limiting scaling factor of the capacity region to be a function
of $\phi(X)$, and be proportional to $\frac{L_1}{1+L_1}$.
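The limiting fraction $\frac{L_1}{1+L_1}$ can be illustrated numerically. The sketch below measures time in units of $N_c$ and assumes a simplified worst case that is only loosely modeled on the counting argument above: $N_3 = L_1$ in the first $i$ update intervals, after which $N_3$ restarts at 1 and doubles each round up to $L_1$. The function name and parameter values are hypothetical.

```python
def near_optimal_fraction(K: int, i: int, L1: int) -> float:
    """Fraction of time spent in update intervals from round i+1 onward.

    Time is in units of N_c. Assumes (for illustration only) that N_3 = L1
    in the first i update intervals, and that N_3 then restarts at 1 and
    doubles every round, capped at L1.
    """
    test_time = K              # one test interval per round
    early_updates = i * L1     # update intervals before round i+1
    late_updates = 0
    n3 = 1
    for _ in range(i, K):      # rounds i+1, ..., K
        late_updates += n3
        n3 = min(2 * n3, L1)
    return late_updates / (test_time + early_updates + late_updates)


if __name__ == "__main__":
    L1 = 8
    for K in (10, 100, 1000, 100000):
        print(K, near_optimal_fraction(K, i=3, L1=L1), L1 / (1 + L1))
```

For a fixed $i$, the printed fraction increases with $K$ toward $L_1/(1+L_1)$ while staying below it, which matches the qualitative statement in the text.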

Note that to obtain the above results, in particular those
mentioned in Case 1 and Case 2, we assume that the inequalities
in (54) and (55) hold for all $K$ scheduling rounds after
time $t$. Therefore, the above discussion for $R_K$ holds only for
the limiting case of $\varrho_\varphi = 0$. In the following, we extend the
preceding discussions to the realistic situation where $\varrho_\varphi > 0$,
and obtain a general lower bound for $R_K$.
Discussion assuming $\varrho_\varphi > 0$: We start by assuming that

$\|X_t\| > M_K > M'_K$, (73)

for a sufficiently large $M'_K$ such that for a given $C$, $C \gg 1$,
Statement 1 holds for all times $t_1$ and $t_2$ greater than $t-1$ and
less than $t + \tau_{(K+2)C} + 1$, and Statement 2 holds for all $i$ with
$0 < i < (K+2)C$. We partition the time between $t + \tau_0 + 1$
and $t + \tau_{(K+2)C} + 1$ into a set of periods, where each period
consists of several scheduling rounds. For simplicity of
discussion, we assume that the first period always starts at
$t + \tau_0 + 1$.

Corresponding to each period, e.g., the $j$th period, we define
a positive r.v. $i_{\delta,j}$. This r.v. takes value $i$, $i > 0$, if the following
conditions are met. First, in the $(i+1)$th round of the period,
the optimal value for $N_1$ is selected for the first time in that period.
Second, the inequality (54) holds for $\varphi_r$ at the $(i+1)$th
round as well as (55) for $\varphi$ at the $i$th round, both in the $j$th
period. Third, $i$ equals $C-1$ if the last two conditions do not
hold for any of the second to the $(C-2)$th rounds in the
period. Recall that the optimal $N_1$ is chosen independently in
each round with probability $\delta$. Thus, using Remark 2 with $K$
replaced with $(K+2)C$, we see that $i_{\delta,j}$ becomes a truncated
geometric r.v. with success probability

$\delta' = (1 - \varrho_\varphi)^2 \delta$, (74)

and with the property that

$P(i_{\delta,j} = C-1) = 1 - \sum_{i=1}^{C-2} \delta'(1-\delta')^{i-1}$. (75)
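A truncated geometric distribution of this kind places the usual geometric mass on the values $1, \ldots, C-2$ and assigns all remaining mass to $C-1$. The Python sketch below builds such a pmf and is meant only as an illustration; `delta_p` stands in for the success probability $\delta'$, and the function name and numbers are assumptions.

```python
from typing import Dict


def truncated_geometric_pmf(delta_p: float, C: int) -> Dict[int, float]:
    """pmf on {1, ..., C-1}: geometric mass for i <= C-2, leftover at C-1.

    Assumes 0 < delta_p < 1 and an integer C >= 3 (hypothetical inputs).
    """
    pmf = {i: delta_p * (1.0 - delta_p) ** (i - 1) for i in range(1, C - 1)}
    # The truncation point absorbs the probability of "no success so far".
    pmf[C - 1] = 1.0 - sum(pmf.values())
    return pmf


if __name__ == "__main__":
    for value, prob in truncated_geometric_pmf(0.3, 6).items():
        print(value, round(prob, 4))
```

The mass at the truncation point equals $(1-\delta')^{C-2}$, the probability that none of the first $C-2$ trials succeeds.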



Similarly, corresponding to the $j$th period, we define a
non-negative r.v. denoted by $i_{\varphi,j}$ that is zero if $i_{\delta,j} = C-1$, and
otherwise, is the number of consecutive rounds immediately
following the $i_{\delta,j}$th round in the period for all of which the
inequalities in (54) and (55) hold. Similar to $i_{\delta,j}$, we limit $i_{\varphi,j}$
to be upper-bounded by $C-1$. We do so by letting $i_{\varphi,j} = C-1$
if for all $C-1$ rounds after the $i_{\delta,j}$th round (54) and (55) hold.
Using this definition of $i_{\varphi,j}$ and Remark 2 with $K$ replaced
with $(K+2)C$, it is easy to see that

$P(i_{\varphi,j} = 0 \mid i_{\delta,j} = C-1) = 1$, (76)

$P(i_{\varphi,j} = 0 \mid i_{\delta,j} \neq C-1) = \varrho_\varphi$, (77)

and

$P(i_{\varphi,j} = k \mid i_{\delta,j} \neq C-1) = (1-\varrho_\varphi)^{2k-1}\left(1 - (1-\varrho_\varphi)^2\right)$, $1 \le k \le C-2$, (78)

and by the boundedness of $i_{\varphi,j}$,

$P(i_{\varphi,j} = C-1 \mid i_{\delta,j} \neq C-1) = (1-\varrho_\varphi)^{2(C-1)-1}$. (79)

To complete the characterization of periods, we define the
last round in the $j$th period to be the one immediately following
the $(i_{\delta,j} + i_{\varphi,j})$th round in the $j$th period. This indicates that the
$j$th period consists of $i_{\delta,j} + i_{\varphi,j} + 1$ rounds, and thus by the
definition of $i_{\delta,j}$ and $i_{\varphi,j}$, its length is always less than $2C$.

Having introduced periods, we now define the sequence
$\{p_j\}_{j=0}^{\infty}$, with $p_0 = 0$, as a subset of indices such that $\tau_{p_j}$,
$j \ge 1$, is the number of timeslots from time $t$ to the last
timeslot in the $j$th period. By definition, therefore, the $j$th
period, $j \ge 1$, starts at $t + \tau_{p_{j-1}} + 1$ and ends at $t + \tau_{p_j} + 1$.

Let $i_K$ be the number of periods that are completely contained in
the $K$ rounds after time $t$, i.e.,

$i_K = \max\{j : p_j \le K, j \ge 0\}$. (80)



By virtue of the definitions for a scheduling period, $i_{\delta,j}$, and
$i_{\varphi,j}$, we can see that for all rounds after the $i_{\delta,j}$th and before
the last round in the $j$th period, all conditions to apply the
discussions in Case 1 and Case 2 are met. Hence, considering
(68), for $1 \le j \le i_K + 1$ and $i_{\delta,j} < i \le i_{\delta,j} + i_{\varphi,j}$ with
$p_{j-1} + i - 1 < K$ we have that

$|\varphi(t + \tau_{p_{j-1}+i-1} + 1) - \phi(t + \tau_0 + 1)| < \gamma$. (81)

Note that $t + \tau_{p_{j-1}+i-1} + 1$ is the start point of the $i$th round
in the $j$th period, and we have set the condition $p_{j-1} + i - 1 < K$
to consider only the first $K$ rounds after time $t$.

We now focus on finding a lower bound for $R_K$. Towards
that goal, we use the r.v.'s $i_{\delta,j}$ and $i_{\varphi,j}$ to define a new sequence
of $N_3$ denoted by $\tilde{N}_3$ according to the following:



N 3 (k = Pj . 
Li 



1 



(1 < i < ig tj ) V 

(i = ij 



+ 1) 



= < 



where 



(i = i Slj + 1) A (v.j = 1) 
(i = igj + 1) A j > 1) 



min( ^ ,+i ,Li) (igj + 2<i<ij)/\ > 1) 



$i_j = i_{\delta,j} + i_{\varphi,j}$.

Note that a round after time $t$ can be specified uniquely either
as the $k$th round after time $t$, or as the $i$th round in the $j$th
period. We thus in the above have defined $\tilde{N}_3$ as a function of
the round number $k$ after time $t$, and also as a function of the
pair $(j,i)$. Similarly, $N_3$ can be considered as a function of
either $k$ or $(j,i)$. In addition, note that the above definition
of $\tilde{N}_3$ is mainly motivated by the method used to obtain (71).

To simplify the analysis, we slightly modify the definition 
of Rk such that 



Rk 



TK - TO 



Hence, $R_K$ concerns only the rounds that are within the first
$i_K$ periods, and for which (81) holds. Considering the above
definition, we can use a simple inspection to show that the
above choices for $\tilde{N}_3$ ensure that w.p.1



Rk > Rk — 



(82) 
where



ij + l 



i=l 

= ij- + 1 + + 1)L X 

+ l ( ^ j=1) + (2+ ]T riiin^,^))!^!), (83) 



i=0 



and 



A?(j)= E ^O'.O 

l=t./ . + 1 
■5 ,3 

= l ( ^ J=1) + (2+ E mm^))!^!) 



»=o 



(84) 



As expected, $\lambda^R(j)$ denotes the minimum contribution of the
$j$th period to the ratio $R_K$. The term $\lambda^p(j)$ is the total length
of the $j$th period that could potentially minimize the ratio $R_K$.
In addition, note that the inequality (82) in general holds even
when Remark 2 does not hold, and thus, when the distribution
of $i_{\delta,j}$ and $i_{\varphi,j}$ is not given by (74)-(79). However, as stated
in Remark 1, we consider the worst case, which enables us to
find a lower bound for $R_K$ that holds with high probability.
We next show that the random variable $R_K$ is a function of
i.i.d. pairs, and in fact, is the average accumulated reward for
a renewal process.

First, note that by definition $i_{\delta,j} > 0$, and hence, a
scheduling period, which consists of $i_{\delta,j} + i_{\varphi,j} + 1$ rounds,
contains at least two rounds. This implies that the $K$ rounds
under consideration constitute at most $\lfloor K/2 \rfloor$ complete periods.
Consequently, $R_K$ is a function of at most $K_P = \lfloor K/2 \rfloor + 1$
periods, and thus, is completely characterized by

$\{(i_{\delta,j}, i_{\varphi,j}), 1 \le j \le K_P\}$. (85)



We know that by definition a period consists of at most $2C-1$
rounds. Therefore, considering Remark 2 with $K$ replaced with
$K_P(2C-1)$, we see that the above set consists of i.i.d.
pairs, with distribution given by (74)-(79), if Statement 2 holds
for all $i$ with $0 < i < K_P(2C-1)$. Recall that we started
by assuming $\|X_t\| > M'_K$ such that Statement 2 holds for
$0 < i < (K+2)C$. But this means that Statement 2 holds for
all $i$ with $0 < i < K_P(2C-1)$ since $K_P(2C-1) < (K+2)C$.
Therefore, we have that the pairs in (85) are i.i.d. (see footnote 13).

Next, observe that since the pair $(\lambda^p(j), \lambda^R(j))$ depends
only on $(i_{\delta,j}, i_{\varphi,j})$, the sequence $\{(\lambda^p(j), \lambda^R(j)) : 1 \le j \le K_P\}$
also consists of i.i.d. pairs. This sequence is defined for
$1 \le j \le K_P$, but can be defined for $j > K_P$ by defining
the pair $(\lambda^p(j), \lambda^R(j))$, for $j > K_P$, as an i.i.d. version of
$(\lambda^p(1), \lambda^R(1))$. The resulting expanded sequence

$\{(\lambda^p(j), \lambda^R(j)) : j \ge 1\}$

13 Note that if $C = \infty$, $i_{\delta,j}$ or $i_{\varphi,j}$ may take any finite value. Hence, a
proper definition of $i_{\delta,j}$ or $i_{\varphi,j}$ with distributions given by (74)-(79) requires
that Statement 2 hold for all $i > 0$, which cannot be guaranteed by assuming
$\|X_t\| > M'_K$ for any finite value of $M'_K$.



defines a reward renewal process. For this renewal process,
$\lambda^p(j)$ is the length of the $j$th inter-renewal interval, $\lambda^R(j)$
is the accumulated reward collected at the end of the $j$th renewal
interval, and $R_K$ is the average accumulated reward prior to the
end of the $(i_K+1)$th inter-renewal interval.
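The renewal-reward viewpoint can be illustrated with a short simulation: when the (length, reward) pairs are i.i.d., the ratio of accumulated reward to accumulated length converges almost surely to the ratio of the two means. The Python sketch below uses a toy pair distribution chosen purely for illustration; it is not the distribution of the period lengths and rewards in the proof, and all names are hypothetical.

```python
import random
from typing import Callable, Tuple


def average_reward(num_periods: int,
                   draw_pair: Callable[[random.Random], Tuple[int, int]],
                   seed: int = 0) -> float:
    """Total reward divided by total length over i.i.d. renewal periods."""
    rng = random.Random(seed)
    total_length = 0
    total_reward = 0
    for _ in range(num_periods):
        length, reward = draw_pair(rng)
        total_length += length
        total_reward += reward
    return total_reward / total_length


def toy_pair(rng: random.Random) -> Tuple[int, int]:
    """Toy period: length uniform on {2, 3, 4}; reward is length - 1 w.p. 1/2."""
    length = rng.choice([2, 3, 4])
    reward = (length - 1) if rng.random() < 0.5 else 0
    return length, reward


if __name__ == "__main__":
    # Here E[length] = 3 and E[reward] = 1, so the limit is 1/3.
    for n in (100, 10_000, 200_000):
        print(n, average_reward(n, toy_pair))
```

With this toy distribution the printed ratios settle near $1/3$ as the number of periods grows, which is the behavior the strong law for renewal-reward processes predicts.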

Consider the extended sequence, and let $R_k$, for any $k > 0$,
be defined similar to $R_K$. Applying the strong law for the
renewal process, and noting that $i_k \to \infty$, a.s., as $k \to \infty$, we
obtain

$\lim_{k \to \infty} R_k = \frac{E[\lambda^R(1)]}{E[\lambda^p(1)]} \triangleq R_C$, a.s.

Hence, by the almost sure convergence, for any given $\epsilon_R > 0$
and $\varrho_R > 0$, there exists a sufficiently large $n_{\varrho_R,\epsilon_R}$ such that

$P\left(\sup_{k > n_{\varrho_R,\epsilon_R}} |R_k - R_C| < \frac{\epsilon_R}{2}\right) > (1 - \varrho_R)$. (86)

But since $\lim_{C \to \infty} R_C = R_\infty$, we can choose a sufficiently
large $C$ such that

$|R_C - R_\infty| < \frac{\epsilon_R}{2}$.

Considering (86) for this value of $C$, we have that

$P\left(\sup_{k > n_{\varrho_R,\epsilon_R}} |R_k - R_\infty| < \epsilon_R\right) > (1 - \varrho_R)$. (87)

The above inequality and (82) imply that there exists a
sufficiently large $K_{\epsilon_R,\varrho_R}$ such that for $K > K_{\epsilon_R,\varrho_R}$ and
$\|X_t\| > M'_K$,

$P(R_K > R_\infty - \epsilon_R \mid \mathcal{H}_{t+\tau_0+1}) > (1 - \varrho_R)$. (88)

Here, we have stated the probability conditioned on $\mathcal{H}_{t+\tau_0+1}$
since all of the previous discussions are valid regardless of
$\mathcal{H}_{t+\tau_0+1}$. The above inequality states that with probability
close to one, $R_K$ is close to $R_\infty$ in the sense that
$R_K > R_\infty - \epsilon_R$. This is a generalized version of the result obtained
in (72), as desired.

We are finally in a position to derive a lower bound for the
LHS of the inequality in the lemma, denoted by $\Xi$. First, note
that

$\Xi = E\left[\sum_{i=\tau_0+1}^{\tau_K} X_{t+i} D_{t+i} \mid \mathcal{H}_t\right]$
$\ge E\left[\sum_{k=1}^{p_{i_K}} \sum_{i=1}^{\tau_k - \tau_{k-1} - N_c} X_{t+\tau_{k-1}+N_c+i} D_{t+\tau_{k-1}+N_c+i} \mid \mathcal{H}_t\right]$, (89)

where we have simply used the fact that the product $X_{t+i} D_{t+i}$
is positive, and neglected the contributions due to the test
intervals, and also the ones due to the rounds of the last
partially covered period.

To simplify the notation, let $t_{j,i,l}$ denote the start of the $l$th
timeslot of the $i$th round in the $j$th period, i.e.,

$t_{j,i,l} = t + \tau_{p_{j-1}+i-1} + l$.

In addition, let $S_{j,i}$ denote the length of the $i$th round in the
$j$th period, i.e.,

$S_{j,i} = \tau_{p_{j-1}+i} - \tau_{p_{j-1}+i-1} = N_c(1 + N_3(j,i))$,

where

$N_3(j,i) = N_3(t + \tau_{p_{j-1}+i-1} + 1)$.



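Considering the partition generated by the scheduling periods,
and the above definitions, we can use (89) to show that

$\Xi \ge E\left[\sum_{j=1}^{i_K} \sum_{i=i_{\delta,j}+1}^{i_j} \sum_{l=1}^{S_{j,i}-N_c} X_{t_{j,i,N_c+l}} D_{t_{j,i,N_c+l}} \mid \mathcal{H}_t\right]$
$\ge E\left[\sum_{j=1}^{i_K} \sum_{i=i_{\delta,j}+1}^{i_j} \sum_{l=1}^{S_{j,i}-N_c} \|X_{t_{j,i,N_c+l}}\| \left(\phi(t + \tau_0 + 1) - \gamma\right) \mid \mathcal{H}_t\right]$, (90)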



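where the last inequality follows from (81). Using (57) and
assuming

$\frac{\|X_{t_{j,i,N_c+l}}\|}{\|X_t\|} > (1 - \epsilon_3)$,

we obtain

$\Xi \ge E\left[\sum_{j=1}^{i_K} \sum_{i=i_{\delta,j}+1}^{i_j} \sum_{l=1}^{S_{j,i}-N_c} \|X_t\| (1 - \epsilon_3)(\phi(t) - \epsilon_1 - \gamma) \mid \mathcal{H}_t\right]$
$= E\left[\|X_t\| (\phi(t) - \epsilon_1 - \gamma)(1 - \epsilon_3) \sum_{j=1}^{i_K} \sum_{i=i_{\delta,j}+1}^{i_j} N_c N_3(j,i) \mid \mathcal{H}_t\right]$. (91)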



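But, by using (88) and adopting a method similar to the one
in the preceding lemma, for $K > K_{\epsilon_R,\varrho_R}$ and $\|X_t\| > M'_K$ we can show
that

$E\left[\sum_{j=1}^{i_K} \sum_{i=i_{\delta,j}+1}^{i_j} N_c N_3(j,i) \mid \mathcal{H}_{t+\tau_0+1}\right] \ge (1 - \epsilon_4)(R_\infty - \epsilon_R) E\left[(\tau_K - \tau_0) \mid \mathcal{H}_{t+\tau_0+1}\right]$, (92)

where $\epsilon_4 \to 0$, as $K \to \infty$.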

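Using (91) and (92), we obtain

$\Xi \ge E\left[(\tau_K - \tau_0) \|X_t\| (\phi(t) - \epsilon_1 - \gamma)(R_\infty - \epsilon_R)(1 - \epsilon_3)(1 - \epsilon_4) \mid \mathcal{H}_t\right]$.

If we assume

$\frac{\tau_K - \tau_0}{\tau_K + 1} > 1 - \epsilon_5$,

then using the above inequality and the definition of $\gamma$, given
in (69), we have that

$\Xi \ge E\left[(\tau_K + 1) \|X_t\| \left(R_\infty(\phi(t) - \alpha - 3\epsilon') - \epsilon\right) \mid \mathcal{H}_t\right]$,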



where $\epsilon > 0$, and can be made arbitrarily small by choosing
sufficiently small values for $\epsilon_1$, $\epsilon_3$, $\epsilon_4$, $\epsilon_5$, and $\epsilon_R$. Note that
since $\|X_t - X_{t+i}\| < C'_K$, $0 < i < \tau_K$, as discussed in
the beginning of the proof of the lemma, $\epsilon_3$ can be assumed
arbitrarily small if $\|X_t\|$ is sufficiently large. Moreover, since
$\tau_0 < (1 + L_1)N_c$ and $\tau_K + 1 > 2KN_c$, $\epsilon_5$ can be made
arbitrarily small by assuming a sufficiently large $K$. Thus, by
considering the discussions for $\epsilon_R$ and $\epsilon_4$, we see that we can
make $\epsilon_R$, $\epsilon_4$, and $\epsilon_5$ all sufficiently small by choosing $K > K_\epsilon$,
for sufficiently large $K_\epsilon$. Having selected $K$, we can find a
lower bound $M_{\epsilon,K} > M_K > M'_K$ for $\|X_t\|$ such that $\epsilon_1$ and
$\epsilon_3$ are also sufficiently small. Hence, $\epsilon$ can be arbitrarily small,
completing the proof of the lemma. ■
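Lemma 5: Let $0 < \delta < 1$. We have

$\prod_{i=1}^{\infty} (1 - \delta^i) \ge \exp\left(-\left(\frac{\delta}{1-\delta} + \frac{\delta^2}{(1-\delta)^2(1-\delta^2)}\right)\right) > 0$.

Proof: First note that by Taylor's theorem, we have

$\ln(1-\delta) \ge -\left(\delta + \frac{\delta^2}{2(1-\delta)^2}\right)$.

Taking $\ln$ and then $\exp$ of the product term in the lemma, and
using the above inequality, we can easily show that

$\prod_{i=1}^{\infty} (1 - \delta^i) \ge \exp\left(-\sum_{i=1}^{\infty}\left(\delta^i + \frac{\delta^{2i}}{(1-\delta)^2}\right)\right)$
$= \exp\left(-\left(\frac{\delta}{1-\delta} + \frac{\delta^2}{(1-\delta)^2(1-\delta^2)}\right)\right) > 0$,

proving the lemma.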
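A quick numerical sanity check of Lemma 5 is sketched below in Python. It assumes the bound reads $\exp\left(-\left(\frac{\delta}{1-\delta} + \frac{\delta^2}{(1-\delta)^2(1-\delta^2)}\right)\right)$, and it approximates the infinite product by a long finite product, which can only overestimate the true value by a negligible amount for the values of $\delta$ tested. The function names and the truncation length are illustrative choices.

```python
import math


def partial_product(delta: float, terms: int = 2000) -> float:
    """Finite approximation of prod_{i>=1} (1 - delta^i)."""
    prod = 1.0
    for i in range(1, terms + 1):
        prod *= 1.0 - delta ** i
    return prod


def lemma5_bound(delta: float) -> float:
    """Lower bound exp(-(d/(1-d) + d^2 / ((1-d)^2 (1-d^2))))."""
    first = delta / (1.0 - delta)
    second = delta ** 2 / ((1.0 - delta) ** 2 * (1.0 - delta ** 2))
    return math.exp(-(first + second))


if __name__ == "__main__":
    for delta in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(delta, partial_product(delta), lemma5_bound(delta))
```

The bound is loose for $\delta$ close to 1, but it stays strictly positive, which is the property the lemma provides.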